Who With Whom And How?: Extracting Large Social Networks Using Search Engines
Stefan Siersdorfer, Philipp Kemkes, Hanno Ackermann, Sergej Zerr

TL;DR
This paper presents scalable query-based methods for extracting large social networks from web data using search engines, employing pattern-based retrieval and bootstrapping to improve efficiency and quality.
Contribution
It introduces novel scalable algorithms for social network extraction from web data, leveraging pattern-based search and iterative expansion techniques.
Findings
High-quality social graphs can be extracted efficiently from large web data.
Unlike existing approaches, which are feasible only for small graphs, the proposed methods scale to large social networks while maintaining high result quality.
Experimental results demonstrate effectiveness across different domains.
Abstract
Social network analysis is leveraged in a variety of applications such as identifying influential entities, detecting communities with special interests, and determining the flow of information and innovations. However, existing approaches for extracting social networks from unstructured Web content do not scale well and are only feasible for small graphs. In this paper, we introduce novel methodologies for query-based search engine mining, enabling efficient extraction of social networks from large amounts of Web data. To this end, we use patterns in phrase queries for retrieving entity connections, and employ a bootstrapping approach for iteratively expanding the pattern set. Our experimental evaluation in different domains demonstrates that our algorithms provide high quality results and allow for scalable and efficient construction of social graphs.
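The two ideas named in the abstract, phrase queries built from patterns and bootstrapping of the pattern set, can be illustrated with a minimal sketch. This is not the authors' implementation: the in-memory `CORPUS`, the `phrase_search` helper, the name regex, and the three-word limit on learned patterns are all assumptions made for illustration, and a real system would send the phrase queries to a search engine instead.

```python
import re
from collections import Counter

# Toy stand-in for a search engine index (assumption: real use would query a search API).
CORPUS = [
    "Alice Smith met with Bob Jones in Berlin.",
    "Alice Smith worked with Bob Jones on a paper.",
    "Bob Jones worked with Carol White on the project.",
    "Carol White worked with Dan Brown for years.",
    "Dan Brown met with Eve Black yesterday.",
]

# Simplistic person-name matcher (assumption: two capitalized words).
NAME = r"([A-Z][a-z]+ [A-Z][a-z]+)"


def phrase_search(phrase):
    """Hypothetical phrase query: return snippets containing the exact phrase."""
    return [doc for doc in CORPUS if phrase in doc]


def find_connections(entity, patterns):
    """Issue one phrase query per pattern and extract the name that follows it."""
    found = set()
    for pattern in patterns:
        phrase = f"{entity} {pattern}"
        for snippet in phrase_search(phrase):
            match = re.search(re.escape(phrase) + " " + NAME, snippet)
            if match:
                found.add(match.group(1))
    return found


def learn_patterns(edges, min_support=1, max_words=3):
    """Bootstrapping step: collect the text linking already-known pairs."""
    counts = Counter()
    for a, b in edges:
        for snippet in phrase_search(a):
            match = re.search(re.escape(a) + r" (.+?) " + re.escape(b), snippet)
            if match and len(match.group(1).split()) <= max_words:
                counts[match.group(1)] += 1
    return {p for p, c in counts.items() if c >= min_support}


def extract_network(seed, seed_patterns, max_rounds=10):
    """Alternate between expanding the graph and expanding the pattern set."""
    patterns = set(seed_patterns)
    entities = {seed}
    edges = set()
    for _ in range(max_rounds):
        before = (len(edges), len(patterns))
        for entity in sorted(entities):  # sorted() also snapshots the set
            for other in find_connections(entity, patterns):
                edges.add((entity, other))
                entities.add(other)
        patterns |= learn_patterns(edges)
        if (len(edges), len(patterns)) == before:
            break  # nothing new was found in this round
    return edges, patterns


if __name__ == "__main__":
    edges, patterns = extract_network("Alice Smith", {"met with"})
    for a, b in sorted(edges):
        print(a, "->", b)
    print(sorted(patterns))
```

In this toy corpus the seed pattern "met with" alone reaches only Bob Jones from Alice Smith. The bootstrapping step then learns "worked with" from the pair it already knows, which lets later rounds reach the remaining entities.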
Taxonomy
Topics: Web Data Mining and Analysis · Complex Network Analysis Techniques · Caching and Content Delivery
