Distributed NLP
Galip Aydin, Ibrahim Riza Hallac

TL;DR
This paper evaluates the performance of parallel text processing for Turkish scientific papers using Map Reduce on a Hadoop cloud platform, comparing it with single-machine processing to assess scalability and efficiency.
Contribution
It demonstrates the application of Map Reduce for NLP tasks in Turkish and provides performance insights on cloud-based versus single-machine processing.
Findings
Parallel processing improves performance on large Turkish NLP tasks.
Hadoop cluster outperforms single-machine processing in speed.
Scalability benefits are demonstrated for NLP in Turkish.
Abstract
In this paper we present the performance of parallel text processing with Map Reduce on a cloud platform. Scientific papers in Turkish language are processed using Zemberek NLP library. Experiments were run on a Hadoop cluster and compared with the single machines performance.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSemantic Web and Ontologies · Natural Language Processing Techniques · Data Mining Algorithms and Applications
