- Text Mining with R: The Free eBook - Oct 15, 2020.
This freely-available book will show you how to perform text analytics in R, using packages from the tidyverse.
Free ebook, R, Text Mining, Tidyverse
- Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Semantics and Pragmatics - Aug 31, 2020.
Algorithms for text analytics must model how language works to incorporate meaning in language—and so do the people deploying these algorithms. Bender & Lascarides 2019 is an accessible overview of what the field of linguistics can teach NLP about how meaning is encoded in human languages.
ebook, NLP, Text Analytics, Text Mining
- Text Mining in Python: Steps and Examples - May 12, 2020.
The majority of data exists in textual form, which is highly unstructured. To produce meaningful insights from text data, we need to follow a method called Text Analysis.
NLP, Python, Text Mining
- The Big Bad NLP Database: Access Nearly 300 Datasets - Feb 28, 2020.
Check out this database of nearly 300 freely-accessible NLP datasets, curated from around the internet.
Datasets, NLP, Text Mining
- All you need to know about text preprocessing for NLP and Machine Learning - Apr 9, 2019.
We present a comprehensive introduction to text preprocessing, covering techniques including stemming, lemmatization, noise removal, and normalization, with examples and explanations of when you should use each of them.
Data Preprocessing, Machine Learning, NLP, Python, Text Analysis, Text Mining
- Towards Automatic Text Summarization: Extractive Methods - Mar 13, 2019.
The basic idea looks simple: find the gist, cut off all opinions and detail, and write a couple of perfect sentences. Yet the task inevitably ends up in toil and turmoil. Here is a short overview of traditional approaches that have beaten a path to advanced deep learning techniques.
Bayesian, Deep Learning, Machine Learning, Sciforce, Text Analysis, Text Mining, Topic Modeling
- Text Preprocessing in Python: Steps, Tools, and Examples - Nov 6, 2018.
We outline the basic steps of text preprocessing, which are needed for transferring text from human language to machine-readable format for further processing. We will also discuss text preprocessing tools.
Data Preparation, NLP, Python, Text Analysis, Text Mining, Tokenization
- Labeling Unstructured Text for Meaning to Achieve Predictive Lift - Oct 31, 2018.
In this post, we examine several advanced NLP techniques, including: labeling nouns and noun phrases for meaning, labeling (most often) adverbs and adjectives for sentiment, and labeling verbs for intent.
NLP, Overfitting, Text Mining, Unstructured data
- Named Entity Recognition and Classification with Scikit-Learn - Oct 25, 2018.
Named Entity Recognition and Classification is a process of recognizing information units like names, including person, organization and location names, and numeric expressions from unstructured text. The goal is to develop practical and domain-independent techniques in order to detect named entities with high accuracy automatically.
NLP, Text Classification, Text Mining
- Machine Learning for Text Classification Using SpaCy in Python - Sep 11, 2018.
In this post, we will demonstrate how text classification can be implemented using spaCy without having any deep learning experience.
NLP, Python, Text Analytics, Text Classification, Text Mining
- Multi-Class Text Classification with Scikit-Learn - Aug 27, 2018.
The vast majority of text classification articles and tutorials on the internet cover binary text classification, such as email spam filtering and sentiment analysis. Real-world problems are much more complicated than that.
NLP, Python, scikit-learn, Text Classification, Text Mining
- Comparison of the Most Useful Text Processing APIs - Aug 23, 2018.
There is a need to compare different APIs to understand their key pros and cons and when it is better to use one API instead of another. Let us proceed with the comparison.
NLP, Text Analytics, Text Mining
- Affordable online news archives for academic research - Aug 10, 2018.
Many researchers need access to multi-year historical repositories of online news articles. We identified three companies that make such access affordable, and spoke with their CEOs.
API, Research, Text Analytics, Text Mining, Webhose
- WTF is TF-IDF? - Aug 2, 2018.
Relevant words are not necessarily the most frequent words since stopwords like “the”, “of” or “a” tend to occur very often in many documents.
Information Retrieval, Python, Text Analytics, Text Mining, TF-IDF
- Text Mining on the Command Line - Jul 13, 2018.
In this tutorial, I use raw bash commands and regex to process a raw, messy JSON file and a raw HTML page. The tutorial helps us understand the text processing mechanism under the hood.
Data Preparation, Data Preprocessing, NLP, Text Mining
- Natural Language Processing Nuggets: Getting Started with NLP - Jun 19, 2018.
Check out this collection of NLP resources for beginners, starting from zero and slowly progressing to the point that readers should have an idea of where to go next.
Beginners, Data Preparation, NLP, Text Mining
- Getting Started with spaCy for Natural Language Processing - May 2, 2018.
spaCy is a Python natural language processing library specifically designed with the goal of being a useful library for implementing production-ready systems. It is particularly fast and intuitive, making it a top contender for NLP tasks.
Data Preparation, Data Preprocessing, NLP, Python, Text Analytics, Text Mining
- Implementing Deep Learning Methods and Feature Engineering for Text Data: The GloVe Model - Apr 25, 2018.
GloVe stands for Global Vectors, an unsupervised learning model that can be used to obtain dense word vectors, similar to Word2Vec.
Deep Learning, Feature Engineering, NLP, Python, Text Mining
- Implementing Deep Learning Methods and Feature Engineering for Text Data: The Skip-gram Model - Apr 10, 2018.
Just like we discussed in the CBOW model, we need to model this Skip-gram architecture now as a deep learning classification model such that we take in the target word as our input and try to predict the context words.
Deep Learning, Feature Engineering, NLP, Python, Text Mining, Word Embeddings
- Machine Learning for Text - Apr 9, 2018.
This book covers machine learning techniques from text using both bag-of-words and sequence-centric methods. The scope of coverage is vast, and it includes traditional information retrieval methods and also recent methods from neural networks and deep learning.
Book, Charu Aggarwal, Information Retrieval, Machine Learning, Text Mining
- Understanding Feature Engineering: Deep Learning Methods for Text Data - Mar 28, 2018.
Newer, advanced strategies for taming unstructured, textual data: In this article, we will be looking at more advanced feature engineering strategies which often leverage deep learning models.
Deep Learning, Feature Engineering, NLP, Python, Text Mining
- Text Data Preprocessing: A Walkthrough in Python - Mar 26, 2018.
This post will serve as a practical walkthrough of a text data preprocessing task using some common Python tools.
Data Preparation, Data Preprocessing, NLP, Python, Text Analytics, Text Mining
- Text Processing in R - Mar 9, 2018.
There are good reasons to want to use R for text processing, namely that we can do it, and that we can fit it in with the rest of our analyses. Furthermore, there is a lot of very active development going on in the R text analysis community right now.
Data Processing, R, Text Analytics, Text Mining
- Training and Visualising Word Vectors - Jan 23, 2018.
In this tutorial I want to show how you can implement a skip-gram model in TensorFlow to generate word vectors for any text you are working with, and then use TensorBoard to visualize them.
Natural Language Processing, Text Mining, Visualization, word2vec
- Elasticsearch for Dummies - Jan 12, 2018.
In this blog, you’ll get to know the basics of Elasticsearch, its advantages, how to install it, and how to index documents with it.
Elasticsearch, NLP, Text Mining
- A General Approach to Preprocessing Text Data - Dec 1, 2017.
Recently we had a look at a framework for textual data science tasks in their totality. Now we focus on putting together a generalized approach to attacking text data preprocessing, regardless of the specific textual data science task you have in mind.
Data Preparation, Data Preprocessing, NLP, Text Analytics, Text Mining, Tokenization
- Building a Wikipedia Text Corpus for Natural Language Processing - Nov 23, 2017.
Wikipedia is a rich source of well-organized textual data, and a vast collection of knowledge. What we will do here is build a corpus from the set of English Wikipedia articles, which is freely and conveniently available online.
Datasets, Natural Language Processing, NLP, Text Mining, Wikidata, Wikipedia
- A Framework for Approaching Textual Data Science Tasks - Nov 22, 2017.
Although NLP and text mining are not the same thing, they are closely related, deal with the same raw data type, and have some crossover in their uses. Let's discuss the steps in approaching these types of tasks.
Modeling, Natural Language Processing, NLP, Text Analytics, Text Mining
- Tips for Getting Started with Text Mining in R and Python - Nov 8, 2017.
This article opens up the world of text mining in a simple and intuitive way and provides great tips to get started with text mining.
Python, R, Text Mining
- Top 10 Machine Learning with R Videos - Oct 24, 2017.
A complete video guide to Machine Learning in R! This great compilation of tutorials and lectures is an amazing recipe to start developing your own Machine Learning projects.
Algorithms, Clustering, K-nearest neighbors, Machine Learning, PCA, R, Text Mining, Top 10, Youtube
- Search Millions of Documents for Thousands of Keywords in a Flash - Sep 1, 2017.
We present a Python library called FlashText that can search or replace keywords / synonyms in documents in O(n) – linear time.
Algorithms, Data Science, GitHub, NLP, Python, Search, Search Engine, Text Mining
- Text Mining 101: Mining Information From A Resume - May 24, 2017.
We show a framework for mining relevant entities from a text resume, and how to separate parsing logic from entity specification.
Career, Natural Language Processing, NLP, Resume, Text Analytics, Text Mining
- Using Deep Learning To Extract Knowledge From Job Descriptions - May 9, 2017.
We present a deep learning approach to extract knowledge from a large amount of data from the recruitment space. A learning-to-rank approach is followed to train a convolutional neural network to generate job title and job description embeddings.
Convolutional Neural Networks, Deep Learning, Natural Language Processing, Neural Networks, NLP, Text Mining
- Text Analytics: A Primer - Mar 14, 2017.
Marketing scientist Kevin Gray asks Professor Bing Liu to give us a quick snapshot of text analytics in this informative interview.
Bing Liu, Natural Language Processing, NLP, Text Analytics, Text Mining
- Text Mining Amazon Mobile Phone Reviews: Interesting Insights - Jan 10, 2017.
We analyzed more than 400 thousand reviews of unlocked mobile phones sold on Amazon.com to find insights with respect to reviews, ratings, price and their relationships.
Amazon, Analytics, Product reviews, Sentiment Analysis, Text Analytics, Text Mining
- Social Media for Marketing and Healthcare: Focus on Adverse Side Effects - Jan 9, 2017.
Social media platforms like Twitter and Facebook are very important sources of big data on the internet, and text mining can surface valuable insights about a product or service to help marketing teams. Let's see how healthcare companies are using big data and text mining to improve their marketing strategies.
Healthcare, NLP, Social Media, Text Analytics, Text Mining, Twitter
- The Great Algorithm Tutorial Roundup - Sep 20, 2016.
This is a collection of tutorials relating to the results of the recent KDnuggets algorithms poll. If you are interested in learning or brushing up on the most used algorithms, as per our readers, look here for suggestions on doing so!
Algorithms, Clustering, Decision Trees, K-nearest neighbors, Machine Learning, PCA, Poll, random forests algorithm, Regression, Statistics, Text Mining, Time Series, Visualization
- America’s Next Topic Model - Jul 15, 2016.
Topic modeling is a great way to get a bird's eye view on a large document collection using machine learning. Here are 3 ways to use the open source Python tool Gensim to choose the best topic model.
LDA, NLP, Python, Text Mining, Topic Modeling, Unsupervised Learning
- Mining Twitter Data with Python Part 7: Geolocation and Interactive Maps - Jul 6, 2016.
The final part of this 7-part series explores using geolocation and interactive maps with Twitter data.
Data Visualization, Geo-Localization, Javascript, Python, Social Media, Social Media Analytics, Text Mining, Twitter
- Mining Twitter Data with Python Part 6: Sentiment Analysis Basics - Jul 5, 2016.
Part 6 of this series builds on the previous installments by exploring the basics of sentiment analysis on Twitter data.
Python, Sentiment Analysis, Social Media, Social Media Analytics, Text Mining, Twitter
- Text Mining 101: Topic Modeling - Jul 1, 2016.
We introduce the concept of topic modelling and explain two methods: Latent Dirichlet Allocation and TextRank. The techniques are ingenious in how they work – try them yourself.
LDA, Text Mining, TextRank, Topic Modeling
- Mining Twitter Data with Python Part 5: Data Visualisation Basics - Jun 29, 2016.
Part 5 of this series takes on data visualization, as we look to make sense of our data and highlight interesting insights.
D3.js, Data Visualization, Python, Social Media, Social Media Analytics, Text Mining, Twitter
- Mining Twitter Data with Python Part 4: Rugby and Term Co-occurrences - Jun 27, 2016.
Part 4 of this series employs some of the lessons learned thus far to analyze tweets related to rugby matches and term co-occurrences.
Python, Social Media, Social Media Analytics, Text Mining, Twitter
- A Data Science Approach to Writing a Good GitHub README - May 4, 2016.
The README is the first file users look for when they check out a code repository. Learn what you should write inside your README files and analyze the effectiveness of your existing files.
Algorithmia, GitHub, Text Mining
- Everything You Need to Know about Natural Language Processing - Dec 21, 2015.
Natural language processing (NLP) helps computers understand human speech and language. We define the key NLP concepts and explain how it fits in the bigger picture of Artificial Intelligence.
API, Buzzlogix, NLP, Text Analytics, Text Mining
- BABELNET 3.5, Largest Multilingual Dictionary and Semantic Network - Sep 29, 2015.
BabelNet 3.5 covers 272 languages, and offers an improved user interface, new integrated resources of Wikiquote, VerbNet, Microsoft Terminology, GeoNames, WoNeF and ImageNet, and a very large knowledge base with over 380 million semantic relations.
BabelNet, RESTful API, Text Mining, Wikidata
- SentimentBuilder: Visual Analysis of Unstructured Texts - Sep 18, 2015.
Sankey diagrams are mainly used to visualize energy flows, material flows and trade-offs. SentimentBuilder found a way to use them with unstructured text in its online NLP tool.
Data Visualization, Sentiment Analysis, Text Mining
- Most Viewed Data Mining Videos on YouTube - May 18, 2015.
The top Data Mining YouTube videos, from publishers like Google and Revolution Analytics, cover topics ranging from statistics in data mining to using R for data mining to data mining in sports.
Ayasdi, Data Mining, Google, Grant Marshall, R, Rattle, Revolution Analytics, Statistica, Text Mining, Weka, Youtube
- Fun and Top! US States in 2 Words using twitteR - Feb 19, 2015.
Combining the twitteR package with text mining techniques and visualization tools can produce interesting outputs. Find out which US state is fun and top, and which is good and crazy, according to Twitter.
R, Text Mining, Twitter, USA
- Most Viewed Web Mining Lectures - Sep 18, 2014.
Discover interesting lectures on topics like mining information networks and identifying influential members of online communities in this list of the top viewed web mining lectures on videolectures.net.
Text Mining, Videolectures, Web Analytics, Web Mining