![]() |
Sometimes while working with a large corpus of text, we can have a problem in which we try to find which character is acting as a delimiter. This can be an interesting and useful utility while working with a huge amount of data and judging the delimiter. A way to solve this problem is discussed in this article using the Python library of detect_delimiter. InstallationTo install this module type the below command in the terminal.
The first step is to check for all the whitelist characters’ presence in the input text, if found, then those characters are counted for most frequencies and a maximum of one is returned, ignoring all from the blacklist list if provided. If no delimiter is from the whitelist, then characters avoiding blacklist characters are computed for maximum frequency, if found, that character is returned as the delimiter. If still delimiter is not found, default is returned as a delimiter if provided, else None is returned.
Example 1: Working with detect() and default In this, few examples of detecting the delimiters are demonstrated along with the use of default. Python3
Output : ![]() Working with detect() and default Example 2: Using blacklist and whitelist parameters Providing whitelist parameter prioritizes any particular delimiter even if its frequency is less than nonwhitelisted delim. The blacklist parameter can help to ignore any delimiter. Python3
Output : ![]() Examples with blacklist and whitelist Parameters. |
Reffered: https://www.geeksforgeeks.org
Python |
Type: | Geek |
Category: | Coding |
Sub Category: | Tutorial |
Uploaded by: | Admin |
Views: | 10 |