Remove all lines in language X
Thread poster: Samuel Murray
Samuel Murray Netherlands Local time: 16:05 Member (2006) English to Afrikaans + ...
Hello everyone

I have a text file with lines of text in English, but unfortunately some of the lines are in Afrikaans. I want to either remove the Afrikaans lines or create a list of all the Afrikaans lines (either option is good for me).

ChatGPT claims to be able to do this, but as usual, it simply creates a list of lines that looks plausible, until you double-check it and discover that the bot had just made up a list that looks highly similar to the topic of the list of input lines.

Is there an AI (or technological) solution to do this?

Thanks
Samuel
Stepan Konev Russian Federation Local time: 17:05 English to Russian Remove by language | Feb 27
You could try removing by language: select all, set 'Detect language automatically' in the Proofing Language settings, and then replace all text marked as Afrikaans, leaving the 'Replace with' field blank.
[Edited at 2024-02-27 18:22 GMT]
Neirda China Local time: 22:05 Chinese to French + ... An alternative to AI | May 23
If you can use Python, there are a few libraries you can use to detect the language of a text and optionally do anything you want with it. What you can use ChatGPT for is to walk you through the steps of doing that; it's simpler than you think.

The catch is:
- Most of these libraries will probably not be too accurate at detecting Afrikaans and might mistake it for German.
- You need a sample of at least a few dozen characters to eliminate false positives.

These libraries are not related to AI but mostly work with "n-grams" (so-called "trained data" with lots of samples of 3 to 4 letters; when you compare these to a corpus of text you can actually detect most languages pretty well).
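To make the n-gram idea concrete, here is a toy sketch using only the Python standard library. It is not how any particular library is implemented: the two sample texts, the `guess_language` name and the choice of cosine similarity are all assumptions made for illustration, and real detectors are trained on far larger corpora.

```python
from collections import Counter
from math import sqrt

def trigrams(text):
    """Count overlapping 3-character sequences in a normalised line."""
    padded = f" {' '.join(text.lower().split())} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a, b):
    """Cosine similarity between two trigram count profiles."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Tiny hand-written "training" samples (an assumption for this sketch;
# far too small for real use).
SAMPLES = {
    "en": ("the quick brown fox jumps over the lazy dog and this is the text "
           "that we want to keep in the file because it is written in english"),
    "af": ("ek neem aan dat dit meer waarskynlik is dat die taal as nederlands "
           "geïdentifiseer sal word ek is nie seker nie maar dit is die taal "
           "van die mense in die land"),
}
PROFILES = {lang: trigrams(sample) for lang, sample in SAMPLES.items()}

def guess_language(line):
    """Return the language code whose trigram profile is closest to the line."""
    grams = trigrams(line)
    return max(PROFILES, key=lambda lang: cosine(grams, PROFILES[lang]))

if __name__ == "__main__":
    for line in ("This is the text that we want to keep",
                 "Dit is nie die taal van die land nie"):
        print(guess_language(line), "|", line)
```

With profiles this small, short or unusual lines will be misclassified easily, which is the same false-positive problem described above.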
Hans Lenting Netherlands Member (2006) German to Dutch
Neirda wrote: - most of these libraries will probably not be too accurate with detecting Afrikaans and might mistake it with German.

I assume that it is more likely that the language will be identified as Dutch.
Ik neem aan dat het waarschijnlijker is dat de taal als Nederlands geïdentificeerd zal worden.
Ek neem aan dat dit meer waarskynlik is dat die taal as Nederlands geïdentifiseer sal word.

Since I've recently installed Python 3 on macOS Sonoma, I'd be grateful for a link to the Python scripts.
Neirda China Local time: 22:05 Chinese to French + ...
You have to do this yourself. Or ask ChatGPT to. I do not know the libraries in Python as I mostly use C sharp, but Python being the most popular coding language I'm sure they exist.

This is what ChatGPT told me:

In Python, there are several libraries available for language detection. Some of the most popular ones include:

langdetect: This library is a port of Google's language-detection library. It's simple to use and supports many languages.

```python
from langdetect import detect

text = "Bonjour tout le monde"
language = detect(text)
print(language)  # Output: 'fr'
```

langid: This library is another option for language identification. It also supports many languages and is quite straightforward to use.

```python
import langid

text = "Hello world"
language, _ = langid.classify(text)
print(language)  # Output: 'en'
```

polyglot: This library offers language detection as part of a larger suite of NLP tools. It requires installing some additional dependencies.

```python
from polyglot.detect import Detector

text = "Hola mundo"
detector = Detector(text)
print(detector.language.code)  # Output: 'es'
```

I'd start with that. Then you will also need to write your own routine for whatever you are trying to achieve.
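As a rough idea of what such a routine could look like, here is a minimal sketch that sorts lines into three piles: keep, flagged as Afrikaans, and too short to judge. The function name `split_by_language`, the `min_chars` threshold of 20 and the decision to also flag Dutch (`"nl"`) in the usage comment are assumptions for illustration; the detector is passed in as a plain function so any of the libraries above could be plugged in.

```python
def split_by_language(lines, detect, targets=("af",), min_chars=20):
    """Split lines into (kept, flagged, unsure).

    detect    -- any callable taking a string and returning a language code
    targets   -- codes that should be flagged for removal
    min_chars -- lines shorter than this go to 'unsure' for manual review
    """
    kept, flagged, unsure = [], [], []
    for line in lines:
        text = line.strip()
        if not text:
            kept.append(line)          # preserve blank lines as they are
            continue
        if len(text) < min_chars:
            unsure.append(line)        # too short for a reliable guess
            continue
        try:
            code = detect(text)
        except Exception:
            unsure.append(line)        # detector could not classify the line
            continue
        (flagged if code in targets else kept).append(line)
    return kept, flagged, unsure

# Possible usage, assuming langdetect is installed and the file names exist:
#
#   from langdetect import detect
#   with open("input.txt", encoding="utf-8") as f:
#       kept, flagged, unsure = split_by_language(f, detect, targets=("af", "nl"))
#   with open("english_only.txt", "w", encoding="utf-8") as f:
#       f.writelines(kept)
#   with open("afrikaans_lines.txt", "w", encoding="utf-8") as f:
#       f.writelines(flagged)
```

Writing the flagged and unsure lines to their own files, rather than deleting them outright, makes it possible to check the detector's guesses by hand before trusting the result.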