Web Usage mining framework for Data Cleaning and IP address Identification
Priyanka Verma, Nishtha Kesswani

TL;DR
This paper presents a framework for web usage mining focusing on data cleaning and IP address identification to improve user pattern analysis from web logs.
Contribution
It proposes new methodologies for data cleaning and IP address identification in web log preprocessing, enhancing user behavior analysis accuracy.
Findings
Number of users identified after IP address processing.
Improved data quality for web usage mining.
Enhanced accuracy in user pattern detection.
Abstract
The World Wide Web is the most widely known information source and is easily available and searchable. It consists of billions of interconnected documents. Web pages are authored by millions of people. Accesses made by various users to pages are recorded in web logs. These log files exist in various formats. Because web usage keeps increasing, the size of web log files is growing at a much faster rate. Web mining is the application of data mining techniques to these log files. It is of three types: Web usage mining, Web structure mining and Web content mining. Web usage mining is the mining of users' usage patterns, which can then be used to personalize web sites and make them more attractive. It consists of three main phases: Preprocessing, Pattern discovery and Pattern analysis. In this paper we focus on the Data cleaning and IP address identification stages of preprocessing. Methodology…
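The abstract is cut off before the methodology, so the authors' actual procedure is not shown here. As a rough illustration of what the two preprocessing stages usually involve, the sketch below cleans a few log lines and then groups the surviving requests by IP address. Everything in it is an assumption made for illustration: the Common Log Format input, the list of ignored file suffixes, the rule of keeping only 2xx responses, and the names `clean_log`, `group_by_ip` and `SAMPLE`.

```python
import re
from collections import defaultdict

# Assumed input: Common Log Format
#   host ident authuser [date] "method url protocol" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Assumed cleaning rule: embedded resources are not user page views.
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def clean_log(lines):
    """Return parsed entries, dropping malformed lines, embedded
    resources, and requests that did not succeed (non-2xx status)."""
    kept = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # malformed or unsupported line
        path = match.group("url").split("?", 1)[0].lower()
        if path.endswith(IGNORED_SUFFIXES):
            continue  # image, stylesheet or script request
        if not match.group("status").startswith("2"):
            continue  # failed or redirected request
        kept.append(match.groupdict())
    return kept


def group_by_ip(entries):
    """Map each distinct IP address to the URLs it requested, in order."""
    by_ip = defaultdict(list)
    for entry in entries:
        by_ip[entry["ip"]].append(entry["url"])
    return dict(by_ip)


# Hypothetical sample data, not taken from the paper.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.1 - - [10/Oct/2000:13:55:37 -0700] "GET /logo.gif HTTP/1.0" 200 512',
    '10.0.0.2 - - [10/Oct/2000:13:56:01 -0700] "GET /missing.html HTTP/1.0" 404 209',
    '10.0.0.2 - - [10/Oct/2000:13:56:10 -0700] "GET /about.html HTTP/1.0" 200 1024',
    '10.0.0.1 - - [10/Oct/2000:13:57:00 -0700] "GET /contact.html HTTP/1.0" 200 800',
]

if __name__ == "__main__":
    users = group_by_ip(clean_log(SAMPLE))
    for ip, urls in users.items():
        print(ip, urls)
```

Treating each distinct IP address as one user is only a first approximation, since proxies and shared machines can place several users behind the same address.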
Taxonomy
Topics: Web Data Mining and Analysis · Recommender Systems and Techniques · Caching and Content Delivery
