Application of Web Scraping Techniques to Scientific Article Search Engines (Penerapan teknik web scraping pada mesin pencari artikel ilmiah)
Ahmad Josi, Leon Andretti Abdillah, Suryayusra

TL;DR
This paper discusses the application of web scraping techniques to extract data from scientific article search engines like Garuda, ISJD, and Google Scholar, highlighting methods for developing effective scraping tools.
Contribution
It presents a specific approach to web scraping for scientific search engines, focusing on extracting data from HTML structures of various platforms.
Findings
Successfully extracted data from multiple scientific search engines.
Developed a web scraping method tailored for academic databases.
Demonstrated the feasibility of automated data retrieval from scholarly sources.
Abstract
Search engines are a combination of computer hardware and software provided by a particular company through a designated website. Search engines collect information from the web through bots, or web crawlers, that crawl the web periodically. The process of retrieving information from existing websites is called "web scraping"; that is, web scraping is a technique for extracting information from websites. Web scraping is closely related to web indexing. To develop a web scraping technique, the programmer first studies the HTML document of the website from which information will be taken, identifying the HTML tags that flank the target information. After learning how the website is navigated, the programmer builds a web scraping application that mimics that navigation to collect the information. It should also be noted that the implementation of this…
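The tag-flanking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sample markup, the `h3` tag with class `title`, and the names `TitleExtractor` and `extract_titles` are all hypothetical, and a real search engine's result page would need its own inspection. The sketch uses only Python's standard `html.parser` module and parses an inline HTML string rather than fetching a live page.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collect the text flanked by <h3 class="title"> ... </h3> tags.

    The tag name and class are assumptions chosen for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.in_title = False  # True while inside a matching <h3>
        self.buffer = []       # text fragments of the title being read
        self.titles = []       # completed titles, in document order

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "h3" and ("class", "title") in attrs:
            self.in_title = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == "h3" and self.in_title:
            self.titles.append("".join(self.buffer).strip())
            self.in_title = False

    def handle_data(self, data):
        # Text inside nested tags (such as <a>) is collected as well
        if self.in_title:
            self.buffer.append(data)


# Hypothetical result-page fragment; real pages will differ.
SAMPLE_HTML = """
<div class="result">
  <h3 class="title"><a href="/article/1">Web Scraping for Academic Search</a></h3>
  <span class="author">A. Author</span>
</div>
<div class="result">
  <h3 class="title"><a href="/article/2">Indexing Scholarly Articles</a></h3>
  <span class="author">B. Author</span>
</div>
"""


def extract_titles(html):
    """Return the list of title strings found in the given HTML text."""
    parser = TitleExtractor()
    parser.feed(html)
    parser.close()
    return parser.titles


if __name__ == "__main__":
    for title in extract_titles(SAMPLE_HTML):
        print(title)
```

In practice, the same pattern would be repeated per site: inspect the page source, note which tags and attributes surround each field of interest, and adjust the matching conditions accordingly.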
Taxonomy
Topics: Computational Physics and Python Applications · Web Data Mining and Analysis · Educational Technology Systems
