How to perform fuzzy logic string matching in NLP

This recipe helps you perform fuzzy logic string matching in NLP.
Last Updated: 20 May 2022

Recipe Objective

How to perform fuzzy logic string matching?

Fuzzy logic string matching is one of the simplest ways to compare strings. It is the process of finding strings that approximately match a given pattern, rather than requiring an exact match. The library used here is fuzzywuzzy, which returns a similarity score out of 100, where a higher score means the two strings are more alike. It uses the Levenshtein distance to calculate the difference between two sequences.
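To make the Levenshtein distance concrete, here is a minimal pure-Python sketch of it. The function name levenshtein is hypothetical and this is not fuzzywuzzy's own code; it only illustrates the idea of counting single-character edits.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

A distance of 0 means the strings are identical; larger values mean more edits are needed. Similarity scores such as the ones in this recipe turn that notion of difference into a number between 0 and 100.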

Let's understand this with a practical implementation.

Step 1 - Import the necessary libraries

from fuzzywuzzy import fuzz
from fuzzywuzzy import process

Step 2 - Let's try simple ratio usage

print("Lets see the ratio for string matching:", fuzz.ratio('Fuzzy for String Matching', 'Fuzzy String Matching'), '\n')
print("Lets see the ratio for string matching:", fuzz.ratio('This is an NLP Session', 'This is an NLP Session'), '\n')
print("Lets see the ratio for string matching:", fuzz.ratio('your learning fuzzywuzzy', 'Your Learning FuzzyWuzzy'))

Lets see the ratio for string matching: 91

Lets see the ratio for string matching: 100

Lets see the ratio for string matching: 83

From the above we can say that:

The first pair scores 91/100 because one word, "for", is missing from the second string.

The second pair scores 100/100 because the two strings are identical.

The third pair scores 83/100 because fuzz.ratio is case-sensitive: the first string is entirely lowercase, while the second string capitalizes some of its letters.
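The scores above can be approximated with the standard library alone. The sketch below uses difflib.SequenceMatcher, whose ratio is 2*M/T (M matched characters, T total characters in both strings). The name simple_ratio is hypothetical, and this is an illustration of the idea rather than fuzzywuzzy's exact implementation, which may differ depending on the installed backend.

```python
from difflib import SequenceMatcher


def simple_ratio(s1: str, s2: str) -> int:
    """Case-sensitive similarity from 0 to 100 based on 2*M/T."""
    return round(100 * SequenceMatcher(None, s1, s2).ratio())


pairs = [
    ("Fuzzy for String Matching", "Fuzzy String Matching"),
    ("This is an NLP Session", "This is an NLP Session"),
    ("your learning fuzzywuzzy", "Your Learning FuzzyWuzzy"),
]
for left, right in pairs:
    print(simple_ratio(left, right))

# Lowercasing both strings first removes the case penalty on the third pair.
print(simple_ratio("your learning fuzzywuzzy".lower(),
                   "Your Learning FuzzyWuzzy".lower()))
```

For the first pair, 21 of the 46 total characters match on each side, so the score is 2*21/46, or about 91. The last line shows that the third pair becomes a perfect match once case is normalized.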

Step 3 - Now we will try with partial ratio

print("Lets see the partial ratio for string matching:", fuzz.partial_ratio('Jon is eating', 'Jon is eating !'), '\n')
print("Lets see the partial ratio for string matching:", fuzz.partial_ratio('Mark is walking on streets', 'Mark walking streets'), '\n')

Lets see the partial ratio for string matching: 100

Lets see the partial ratio for string matching: 80

From the above we can see how partial_ratio works in the FuzzyWuzzy library:

The first pair scores 100/100. The second string adds an exclamation mark, but partial_ratio compares the shorter string against the best-matching part of the longer one, so the extra characters are ignored.

The second pair scores 80/100. The score is lower because the first string contains extra tokens ("is" and "on") in the middle, so no part of it matches the second string exactly.
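The partial-matching idea can be sketched as a sliding window: take the shorter string and compare it against every same-length slice of the longer one, keeping the best score. The name partial_ratio_sketch is hypothetical, and this brute-force version is a simplification; fuzzywuzzy chooses its candidate slices differently, so scores for imperfect matches can differ from the library's.

```python
from difflib import SequenceMatcher


def partial_ratio_sketch(s1: str, s2: str) -> int:
    """Best similarity (0-100) between the shorter string and any
    same-length slice of the longer string."""
    shorter, longer = sorted((s1, s2), key=len)
    if not shorter:
        return 0  # nothing to match against
    window = len(shorter)
    best = 0.0
    for start in range(len(longer) - window + 1):
        chunk = longer[start:start + window]
        best = max(best, SequenceMatcher(None, shorter, chunk).ratio())
    return round(100 * best)


# The shorter string appears verbatim inside the longer one.
print(partial_ratio_sketch("Jon is eating", "Jon is eating !"))
# Extra words in the middle mean no slice matches exactly.
print(partial_ratio_sketch("Mark is walking on streets", "Mark walking streets"))
```

When the shorter string occurs verbatim inside the longer one, as in the first pair, some window matches perfectly and the score is 100. Extra words in the middle of a string prevent any single window from matching exactly, which is why the second pair scores below 100.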

Step 4 - Token set ratio and token sort ratio

print("Lets see the token sort ratio for string matching:", fuzz.token_sort_ratio("for every one", "every one for"), '\n')
print("Lets see the token set ratio for string matching:", fuzz.token_set_ratio("This is done", "This is done done"))

Lets see the token sort ratio for string matching: 100

Lets see the token set ratio for string matching: 100

The ratio is 100/100 in both cases because:

token_sort_ratio sorts the words in each string before comparing them, so word order does not matter when the words are the same.

token_set_ratio treats duplicate words as a single word.

Step 5 - WRatio with Example

print("Lets see the Wratio for string matching:", fuzz.WRatio("This is good", "This is good"), '\n')
print("Lets see the Wratio for string matching:", fuzz.WRatio("Sometimes good is bad!!!", "Sometimes good is bad"))

Lets see the Wratio for string matching: 100

Lets see the Wratio for string matching: 100

It is sometimes better to use WRatio instead of the simple ratio, because WRatio ignores differences in letter case and punctuation and combines the other ratios into a single weighted score.
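The effect of ignoring case and punctuation can be illustrated with a small normalization step before a plain comparison. The helper names normalize and ratio are hypothetical, and the regular expression is an assumption about what counts as punctuation; this sketch shows only the preprocessing idea, not the weighting that WRatio applies across the other ratios.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase the text and replace runs of non-alphanumeric
    characters with a single space (an assumed cleaning rule)."""
    return re.sub(r"[\W_]+", " ", text).lower().strip()


def ratio(a: str, b: str) -> int:
    return round(100 * SequenceMatcher(None, a, b).ratio())


raw_a, raw_b = "Sometimes good is bad!!!", "Sometimes good is bad"

# Compared as-is, the trailing punctuation lowers the score.
print(ratio(raw_a, raw_b))
# After normalization the two strings are identical.
print(ratio(normalize(raw_a), normalize(raw_b)))
```

The raw comparison is penalized for the trailing "!!!", while the normalized strings are identical, which mirrors why the second WRatio example above reaches 100.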
