Arabic NLP Datasets

| Category | URL |
| --- | --- |
| Papers | https://arxiv.org/ftp/arxiv/papers/1702/1702.07835.pdf |
| Sentiment Analysis | http://archive.ics.uci.edu/ml/datasets/Twitter+Data+set+for+Arabic+Sentiment+Analysis |
| Classification | http://archive.ics.uci.edu/ml/datasets/Opinion+Corpus+for+Lebanese+Arabic+Reviews+%28OCLAR%29 |
| Wikipedia | https://archive.org/details/arwiki-20190201 |
| Multiple | https://lionbridge.ai/datasets/best-arabic-datasets-for-machine-learning/ |
| | https://github.com/ibnmalik/golden-corpus-arabic/tree/develop/core |
| | https://github.com/linuxscout/tashkeela2 |
| Diacritization | https://github.com/AliOsm/arabic-text-diacritization/tree/master/dataset |
| | http://tanzil.net/download |
| | https://www.kaggle.com/datasets/linuxscout/tashkeela?resource=download |
| Crawls | https://traces1.inria.fr/oscar/ |
| | https://github.com/alisafaya/Arabic-BERT |
| | https://github.com/mohataher/arabic_big_corpus |
| | https://github.com/aosaimy/riyadh-corpus-collection |
| | https://github.com/anastaw/Arabic-Wikipedia-Corpus/blob/master/Wikipedia-Corpus-30-08-10.sql.gz |
| | https://github.com/antcorpus/RSSCrawlerArabicCorpus |
| | BBC Crawl: https://github.com/motazsaad/bbc-crawler |
| News | https://github.com/parallelfold/SaudiNewsNet |
| | https://github.com/motazsaad/Arabic-News |
| ArabicWeb16 | Labeled Dataset: https://sites.google.com/view/arabicweb16/download/labelled-datasets?authuser=0 |
| | Sample big: https://sites.google.com/view/arabicweb16/getting-started?authuser=0 https://drive.google.com/drive/folders/0B6P2zR7VKiV4SWdITFlXcmxObWM |
| Raw | https://github.com/Islamicate-DH/arabicCorpus |
| NER | https://github.com/EmnamoR/Arabic-named-entity-recognition |
| | https://github.com/RamziSalah/Classical-Arabic-Named-Entity-Recognition-Corpus |
| | https://www.cs.cmu.edu/~ark/ArabicNER/ |
| | https://github.com/juand-r/entity-recognition-datasets |
| | https://github.com/oudalab/Arabic-NER |
| | https://github.com/EmnamoR/Arabic-named-entity-recognition/tree/master/ |
| Keyphrase | https://github.com/ailab-uniud/akec |
| Sentiment Analysis | https://github.com/nora-twairesh/AraSenti |
| | https://github.com/almoslmi/masc |
| | https://github.com/marwanalomari/Sentiment-Classifier-Logistic-Regression-for-Arabic-Services-Reviews-in-Lebanon |
| | https://github.com/komari6/Arabic-twitter-corpus-AJGT |
| | https://tahatobaili.github.io/project-rbz/ |
| Speech data | http://www.cs.stir.ac.uk/~lss/arabic/ |
| | https://github.com/Anwarvic/Arabic-Speech-Recognition |
| Tashkeel | https://github.com/Anwarvic/Tashkeela-Model |
| Speech to text | https://github.com/motazsaad/jsc-news-broadcast |
| Misspellings | https://github.com/linuxscout/aghlat |
| Arabic/English Translation | https://github.com/meedan/news-memory |
| Poetry | https://github.com/d7eame/Matn |
| Word Embeddings | http://mazajak.inf.ed.ac.uk:8000/ |
| Stories | https://github.com/motazsaad/Arabic-Stories-Corpus |
| Dialects | https://github.com/motazsaad/corpus2json/tree/master/corpora/nizar_arabic_dialects |
| Opinion Mining | https://github.com/AhmedObaidi/omcca |
| POS + rel | https://github.com/salsama/Arabic-Information-Extraction-Corpus |
| | https://github.com/qcri/dialectal_arabic_pos_tagger |
| | https://github.com/seloufian/Arabic-PoS-Tagger |
| Annotated per nationality | https://github.com/Data-Science-for-Linguists-2020/Arabic-Learner-Corpus-Considerations |
| OCR | Digits: https://www.kaggle.com/mloey1/ahdd1 |
| | Letters: https://www.kaggle.com/mloey1/ahcd1 |
| | https://cactus.orange-labs.fr/ALIF/download.html |
| | http://www.ccse.kfupm.edu.sa/~husni/ArabicOCR/PATS-A02.htm |
| | http://kafd.ideas2serve.net/KAFDDownloadOptions.php |
| | https://github.com/ainawind27/arabicocr-data |
| | http://kitab-project.org/ |
| | https://medium.com/@openiti/openiti-aocp-9802865a6586 |
| | https://www.rdi-sotoor.com/#/login |
| | https://www.primaresearch.org/RASM2019/ |
| | https://blogs.bl.uk/digital-scholarship/2018/02/8th-century-arabic-scientists-meet-todays-computer-scientists.html |
| | https://blogs.bl.uk/digital-scholarship/2018/03/arabic-handwrittten-ocr.html |
| | Raw images: https://fromthepage.com/bldigital/arabic-scientific-manuscripts |
| Arabic conversation for chatbots | https://www.kaggle.com/ahmedkaramdev/arabic-conversational-dataset |
| WordNet | http://compling.hss.ntu.edu.sg/omw/ |
| | Resources: http://globalwordnet.org/resources/arabic-wordnet/arabic-resources/ |
| | http://compling.hss.ntu.edu.sg/omw/wns/arb/LICENSE |
| Other | https://www.al-fanarmedia.org/2018/11/an-online-arabic-dictionary-makes-its-debut/#.W__YlAjqMNw.twitter |
| | Treebank: https://sourceforge.net/projects/arabicsubcats/files/ |