- Sentiment Analysis with KNIME - Nov 29, 2021.
Check out this tutorial on how to approach sentiment classification with supervised machine learning algorithms.
Knime, NLP, Sentiment Analysis, Text Analytics
- How to fast-track machine translation projects - Nov 16, 2021.
Data is the lifeblood of any successful machine learning model, and machine translation models are no exception. Without relevant and properly labelled data, even the most sophisticated model will be unable to achieve reliable results.
Machine Translation, NLP, Text Analytics
- Simple Text Scraping, Parsing, and Processing with this Python Library - Oct 29, 2021.
Scraping, parsing, and processing text data from the web can be difficult. But it can also be easy, using Newspaper3k.
Data Processing, NLP, Python, Text Analytics, Web Scraping
- 15 Must-Know Python String Methods - Sep 21, 2021.
It is not always about numbers.
Data Processing, NLP, Python, Text Analytics
- Text Preprocessing Methods for Deep Learning - Sep 10, 2021.
While the preprocessing pipeline we are focusing on in this post is mainly centered around Deep Learning, most of it will also be applicable to conventional machine learning models.
Data Preprocessing, Data Processing, Deep Learning, NLP, Text Analytics
- Semantic Search: Measuring Meaning From Jaccard to Bert - Jul 2, 2021.
In this article, we’ll cover a few of the most interesting and powerful of these techniques, focusing specifically on semantic search. We’ll learn how they work, what they’re good at, and how we can implement them ourselves.
BERT, NLP, Search, Similarity, Text Analytics
- How to Train a Joint Entities and Relation Extraction Classifier using BERT Transformer with spaCy 3 - Jun 28, 2021.
A step-by-step guide on how to train a relation extraction classifier using a Transformer and spaCy 3.
BERT, NLP, Python, spaCy, Text Analytics, Transformer
- The Word “WORD” Has 13 Meanings - Jun 22, 2021.
Thoughts around Knowledge Graphs, the semantic nature of language, and the two main types of word ambiguity.
Expert.ai, Knowledge Graph, NLP, Text Analytics
- A Graph-based Text Similarity Method with Named Entity Information in NLP - Jun 16, 2021.
In this article, the author summarizes the 2017 paper "A Graph-based Text Similarity Measure That Employs Named Entity Information" as per their understanding. Better understand the concepts by reading along.
Graphs, NLP, Similarity, Text Analytics
- Topic Modeling with Streamlit - May 26, 2021.
What does it take to create and deploy a topic modeling web application quickly? Read this post to see how the author uses Python NLP packages for topic modeling, Streamlit for the web application framework, and Streamlit Sharing for deployment.
Deployment, NLP, Python, spaCy, Streamlit, Text Analytics, Topic Modeling
- Machine Translation in a Nutshell - May 17, 2021.
Marketing scientist Kevin Gray asks Dr. Anna Farzindar of the University of Southern California for a snapshot of machine translation. Dr. Farzindar also provided the original art for this article.
Machine Translation, Neural Networks, NLP, Text Analytics
- How to Apply Transformers to Any Length of Text - Apr 12, 2021.
Read on to find out how to restore the power of NLP for long sequences.
BERT, NLP, Python, Text Analytics, Transformer
- Automated Text Classification with EvalML - Apr 6, 2021.
Learn how EvalML leverages Woodwork, Featuretools and the nlp-primitives library to process text data and create a machine learning model that can detect spam text messages.
Automated Machine Learning, AutoML, NLP, Python, Text Analytics, Text Classification
- How to Begin Your NLP Journey - Mar 17, 2021.
In this blog post, learn how to process text using Python.
NLP, Python, Text Analytics
- Natural Language Processing Pipelines, Explained - Mar 16, 2021.
This article presents a beginner's view of NLP, as well as an explanation of how a typical NLP pipeline might look.
Explained, NLP, NLTK, Python, Text Analytics
- Getting Started with 5 Essential Natural Language Processing Libraries - Feb 3, 2021.
This article is an overview of how to get started with 5 popular Python NLP libraries, from those for linguistic data visualization, to data preprocessing, to multi-task functionality, to state of the art language modeling, and beyond.
Data Preparation, Data Preprocessing, Data Visualization, Hugging Face, NLP, Python, spaCy, Text Analytics, Transformer
- How to Clean Text Data at the Command Line - Dec 16, 2020.
A basic tutorial about cleaning data using command-line tools: tr, grep, sort, uniq, awk, sed, and csvlook.
Data Preprocessing, Data Processing, NLP, Text Analytics
- Optimizing the Levenshtein Distance for Measuring Text Similarity - Oct 16, 2020.
To speed up the calculation of the Levenshtein distance, this tutorial computes it using a vector rather than a matrix, which saves a lot of time. We’ll be coding in Java for this implementation.
Java, NLP, Text Analytics
- Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Semantics and Pragmatics - Aug 31, 2020.
Algorithms for text analytics must model how language works to incorporate meaning in language—and so do the people deploying these algorithms. Bender & Lascarides 2019 is an accessible overview of what the field of linguistics can teach NLP about how meaning is encoded in human languages.
ebook, NLP, Text Analytics, Text Mining
- The NLP Model Forge: Generate Model Code On Demand - Aug 24, 2020.
You've seen their Big Bad NLP Database and The Super Duper NLP Repo. Now Quantum Stat is back with its most ambitious NLP product yet: The NLP Model Forge.
Google Colab, Modeling, NLP, Text Analytics
- Simple Question Answering (QA) Systems That Use Text Similarity Detection in Python - Apr 7, 2020.
How exactly are smart algorithms able to engage and communicate with us like humans? The answer lies in Question Answering systems that are built on a foundation of Machine Learning and Natural Language Processing. Let's build one here.
NLP, Python, Question answering, Similarity, Text Analytics
- Why you should NOT use MS MARCO to evaluate semantic search - Apr 2, 2020.
If we want to investigate the power and limitations of semantic vectors (pre-trained or not), we should ideally prioritize datasets that are less biased towards term-matching signals. This piece shows that the MS MARCO dataset is more biased towards those signals than we expected and that the same issues are likely present in many other datasets due to similar data collection designs.
Data Science, Metrics, NLP, Text Analytics
- Alternative Data, Text Analytics, and Sentiment Analysis in Trading and Investing - Mar 25, 2020.
Different types of data beyond your typical dollars and cents have been used in the finance industry for many years. By leveraging machine learning, sentiment data is expected to play an increasingly dominant role in the investment industry, and this article highlights some special challenges of its use in trading models.
Investment, Sentiment Analysis, Text Analytics
- How To Build Your Own Feedback Analysis Solution - Mar 12, 2020.
Automating the analysis of customer feedback will sound like a great idea after reading a couple hundred reviews. Building an NLP solution to provide in-depth analysis of what your customers are thinking is a serious undertaking, and this guide helps you scope out the entire project.
Customer Analytics, NLP, Text Analytics
- Tokenization and Text Data Preparation with TensorFlow & Keras - Mar 6, 2020.
This article will look at tokenizing and further preparing text data for feeding into a neural network using TensorFlow and Keras preprocessing tools.
Data Preprocessing, Keras, NLP, Python, TensorFlow, Text Analytics, Tokenization
- Automatic Text Summarization in a Nutshell - Dec 18, 2019.
Marketing scientist Kevin Gray asks Dr. Anna Farzindar of the University of Southern California about Automatic Text Summarization and the various ways it is used.
NLP, Text Analytics, Text Summarization
- Markov Chains: How to Train Text Generation to Write Like George R. R. Martin - Nov 29, 2019.
Read this article on training Markov chains to generate George R. R. Martin style text.
Generative Models, Markov Chains, NLP, Text Analytics
- Text Encoding: A Review - Nov 22, 2019.
We will focus here exactly on that part of the analysis that transforms words into numbers and texts into number vectors: text encoding.
Data Preprocessing, NLP, Representation, Rosaria Silipo, Text Analytics, Word Embeddings
- Lemma, Lemma, Red Pyjama: Or, doing words with AI - Oct 10, 2019.
If we want a machine learning model to be able to generalize these forms together, we need to map them to a shared representation. But when are two different words the same for our purposes? It depends.
AI, NLP, Text Analytics
- An Overview of Topics Extraction in Python with Latent Dirichlet Allocation - Sep 4, 2019.
A recurring subject in NLP is understanding a large corpus of texts through topic extraction. Whether you analyze users’ online reviews, products’ descriptions, or text entered in search bars, understanding key topics will always come in handy.
LDA, NLP, Python, Text Analytics, Topic Modeling
- When Too Likely Human Means Not Human: Detecting Automatically Generated Text - May 23, 2019.
Passably-human automated text generation is a reality. How do we best go about detecting it? As it turns out, being too predictably human may actually be a reasonably good indicator of not being human at all.
Generative Models, NLP, Text Analytics
- A Complete Exploratory Data Analysis and Visualization for Text Data: Combine Visualization and NLP to Generate Insights - May 9, 2019.
Visually representing the content of a text document is one of the most important tasks in the field of text mining as a Data Scientist or NLP specialist. However, there are some gaps between visualizing unstructured (text) data and structured data.
Data Visualization, NLP, Plotly, Python, Text Analytics
- How to solve 90% of NLP problems: a step-by-step guide - Jan 14, 2019.
Read this insightful, step-by-step article on how to use machine learning to understand and leverage text.
LIME, NLP, Text Analytics, Text Classification, Word Embeddings, word2vec
- Comparison of the Text Distance Metrics - Jan 7, 2019.
There are many different approaches to comparing two texts (strings of characters). Each has its own advantages and disadvantages and is good only for a range of specific use cases.
Metrics, NLP, Text Analytics
- Machine Learning for Text Classification Using SpaCy in Python - Sep 11, 2018.
In this post, we will demonstrate how text classification can be implemented using spaCy without having any deep learning experience.
NLP, Python, Text Analytics, Text Classification, Text Mining
- Topic Modeling with LSA, PLSA, LDA & lda2Vec - Aug 30, 2018.
This article is a comprehensive overview of Topic Modeling and its associated techniques.
LDA, NLP, Text Analytics, Topic Modeling
- Word Vectors in Natural Language Processing: Global Vectors (GloVe) - Aug 29, 2018.
A well-known model that learns vectors of words from their co-occurrence information is Global Vectors (GloVe). While word2vec is a predictive model (a feed-forward neural network that learns vectors to improve its predictive ability), GloVe is a count-based model.
NLP, Sciforce, Text Analytics, word2vec
- Emotion and Sentiment Analysis: A Practitioner’s Guide to NLP - Aug 24, 2018.
Sentiment analysis is widely used, especially as a part of social media analysis for any domain, be it a business, a recent movie, or a product launch, to understand its reception by the people and what they think of it based on their opinions or, you guessed it, sentiment!
NLP, Text Analytics, Workflow
- Comparison of the Most Useful Text Processing APIs - Aug 23, 2018.
There is a need to compare different APIs to understand their key pros and cons and when it is better to use one API instead of another. Let us proceed with the comparison.
NLP, Text Analytics, Text Mining
- Named Entity Recognition: A Practitioner’s Guide to NLP - Aug 17, 2018.
Named entity recognition (NER), also known as entity chunking/extraction, is a popular technique used in information extraction to identify and segment the named entities and classify or categorize them under various predefined classes.
NLP, Text Analytics, Workflow
- Affordable online news archives for academic research - Aug 10, 2018.
Many researchers need access to multi-year historical repositories of online news articles. We identified three companies that make such access affordable, and spoke with their CEOs.
API, Research, Text Analytics, Text Mining, Webhose
- Understanding Language Syntax and Structure: A Practitioner’s Guide to NLP - Aug 10, 2018.
Knowledge about the structure and syntax of language is helpful in many areas like text processing, annotation, and parsing for further operations such as text classification or summarization.
NLP, Text Analytics, Workflow
- Text Wrangling & Pre-processing: A Practitioner’s Guide to NLP - Aug 3, 2018.
I will highlight some of the most important steps that are used heavily in Natural Language Processing (NLP) pipelines and that I frequently use in my NLP projects.
Data Preprocessing, Data Wrangling, NLP, Text Analytics, Workflow
- WTF is TF-IDF? - Aug 2, 2018.
Relevant words are not necessarily the most frequent words since stopwords like “the”, “of” or “a” tend to occur very often in many documents.
Information Retrieval, Python, Text Analytics, Text Mining, TF-IDF
- Data Retrieval with Web Scraping: A Practitioner’s Guide to NLP - Jul 26, 2018.
Proven and tested hands-on strategies to tackle NLP tasks.
Data Preprocessing, NLP, Text Analytics, Workflow
- Getting Started with spaCy for Natural Language Processing - May 2, 2018.
spaCy is a Python natural language processing library specifically designed with the goal of being a useful library for implementing production-ready systems. It is particularly fast and intuitive, making it a top contender for NLP tasks.
Data Preparation, Data Preprocessing, NLP, Python, Text Analytics, Text Mining
- 50+ Useful Machine Learning & Prediction APIs, 2018 Edition - May 1, 2018.
Extensive list of 50+ APIs in Face and Image Recognition, Text Analysis, NLP, Sentiment Analysis, Language Translation, Machine Learning and prediction.
API, Face Recognition, Image Recognition, Machine Learning, Natural Language Processing, Sentiment Analysis, Text Analytics
- Python Regular Expressions Cheat Sheet - Apr 19, 2018.
The tough thing about learning data science is remembering all the syntax. While at Dataquest we advocate getting used to consulting the Python documentation, sometimes it's nice to have a handy reference, so we've put together this cheat sheet to help you out!
Cheat Sheet, Programming, Python, Text Analytics
- Top 20 Deep Learning Papers, 2018 Edition - Apr 3, 2018.
Deep Learning is constantly evolving at a fast pace. New techniques, tools and implementations are changing the field of Machine Learning and bringing excellent results.
Algorithms, Deep Learning, Machine Learning, Neural Networks, TensorFlow, Text Analytics, Trends
- Text Data Preprocessing: A Walkthrough in Python - Mar 26, 2018.
This post will serve as a practical walkthrough of a text data preprocessing task using some common Python tools.
Data Preparation, Data Preprocessing, NLP, Python, Text Analytics, Text Mining
- Text Processing in R - Mar 9, 2018.
There are good reasons to want to use R for text processing, namely that we can do it, and that we can fit it in with the rest of our analyses. Furthermore, there is a lot of very active development going on in the R text analysis community right now.
Data Processing, R, Text Analytics, Text Mining
- A General Approach to Preprocessing Text Data - Dec 1, 2017.
Recently we had a look at a framework for textual data science tasks in their totality. Now we focus on putting together a generalized approach to attacking text data preprocessing, regardless of the specific textual data science task you have in mind.
Data Preparation, Data Preprocessing, NLP, Text Analytics, Text Mining, Tokenization
- A Framework for Approaching Textual Data Science Tasks - Nov 22, 2017.
Although NLP and text mining are not the same thing, they are closely related, deal with the same raw data type, and have some crossover in their uses. Let's discuss the steps in approaching these types of tasks.
Modeling, Natural Language Processing, NLP, Text Analytics, Text Mining
- Text Clustering: Quick insights from Unstructured Data, part 2 - Jul 4, 2017.
We will build this in a modular way and also focus on exposing the functionalities as an API so that it can serve as a plug and play model without any disruptions to the existing systems.
API, Clustering, Python, Text Analytics, Unstructured data
- Text Clustering: Get quick insights from Unstructured Data - Jun 28, 2017.
Grouping and clustering free text is an important advance towards making good use of it. We present an unsupervised text clustering approach that enables businesses to programmatically bin this data.
Clustering, Text Analytics, Unstructured data
- Text Mining 101: Mining Information From A Resume - May 24, 2017.
We show a framework for mining relevant entities from a text resume, and how to separate parsing logic from entity specification.
Career, Natural Language Processing, NLP, Resume, Text Analytics, Text Mining
- Machine Learning Finds “Fake News” with 88% Accuracy - Apr 12, 2017.
In this post, the author assembles a dataset of fake and real news and employs a Naive Bayes classifier in order to create a model to classify an article as fake or real based on its words and phrases.
Data Science, Fake News, Machine Learning, Naive Bayes, Politics, Text Analytics
- Text Analytics: A Primer - Mar 14, 2017.
Marketing scientist Kevin Gray asks Professor Bing Liu to give us a quick snapshot of text analytics in this informative interview.
Bing Liu, Natural Language Processing, NLP, Text Analytics, Text Mining
- Provalis Research Releases an Enhanced Qualitative Data Analysis Freeware - Feb 3, 2017.
Upgraded version of the qualitative analysis freeware QDA Miner Lite now includes a document overview, tree-grid display, image rotation and resizing, importing from PowerPoint and more.
Provalis, Qualitative Analytics, Text Analytics
- Text Mining Amazon Mobile Phone Reviews: Interesting Insights - Jan 10, 2017.
We analyzed more than 400 thousand reviews of unlocked mobile phones sold on Amazon.com to uncover insights about reviews, ratings, price and their relationships.
Amazon, Analytics, Product reviews, Sentiment Analysis, Text Analytics, Text Mining
- Social Media for Marketing and Healthcare: Focus on Adverse Side Effects - Jan 9, 2017.
Social media platforms like Twitter and Facebook are very important sources of big data on the internet, and text mining can surface valuable insights about a product or service to help marketing teams. Let's see how healthcare companies are using big data and text mining to improve their marketing strategies.
Healthcare, NLP, Social Media, Text Analytics, Text Mining, Twitter
- What Data Scientists Can Learn From Qualitative Research - Jul 14, 2016.
Learn what data scientists can learn from qualitative researchers when it comes to analysing text, and how this relates to writing quality code.
Programming, Qualitative Analytics, Qualitative Research, Text Analytics
- Elementary, My Dear Watson! An Introduction to Text Analytics via Sherlock Holmes - Feb 12, 2016.
Want to learn about the field of text mining? Go on an adventure with Sherlock & Watson. Here you will find the different sub-domains of text mining along with a practical example.
Dato, NLP, Sherlock Holmes, Text Analytics
- Everything You Need to Know about Natural Language Processing - Dec 21, 2015.
Natural language processing (NLP) helps computers understand human speech and language. We define the key NLP concepts and explain how NLP fits into the bigger picture of Artificial Intelligence.
API, Buzzlogix, NLP, Text Analytics, Text Mining
- 11 things to know about Sentiment Analysis - Aug 13, 2015.
Seth Grimes, a text analytics guru, shares 11 key observations on what works, what is past, what is coming, and what to keep in mind while doing sentiment analysis.
Affective Computing, Emoji, Sentiment Analysis, Text Analytics
- Algorithmia Tested: Human vs Automated Tag Generation - Apr 21, 2015.
Algorithmia, the marketplace for algorithms, can be a platform for hosting APIs to do a plethora of text analytics and information retrieval tasks. Automatic post tagging is done in this case study to demonstrate the effectiveness and ease-of-use of the platform.
Algorithmia, API, Grant Marshall, Information Retrieval, Python, Text Analytics
- Text Analysis 101: Document Classification - Jan 24, 2015.
Document classification is an example of Machine Learning (ML) in the form of Natural Language Processing (NLP). By classifying text, we are aiming to assign one or more classes or categories to a document, making it easier to manage and sort.
Document Classification, Parsa Ghaffari, Text Analytics, Text Classification