Source term collector
Thread poster: Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Jul 6, 2024

Many CAT tools provide a function to list the frequent source terms of a project. This process usually produces a lot of garbage. Is there a program that looks only at the words immediately to the left and right of frequent nouns and then lists the resulting groups of two or three words?

 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Source fragment harvester Jul 7, 2024

I should have chosen "Source fragment harvester" as the subject.

Since there have been no replies to my post, I'd like to share an idea I've had since posting:

Use a regular expression to extract the candidates.

Sort in Excel and delete the noise.
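
For anyone who would rather script the two steps above than use Excel, here is a minimal Python sketch. The actual regular expression is only visible in the screenshots, so the pattern below is an assumption: it treats a "word" as a run of letters (optionally joined by a hyphen or apostrophe) and counts every two- and three-word sequence that does not cross punctuation. The names `WORD`, `harvest_fragments` and `min_count` are made up for illustration.

```python
import re
from collections import Counter

# Assumed token pattern: letters only, optionally joined by "-" or "'".
WORD = r"[^\W\d_]+(?:[-'][^\W\d_]+)*"


def harvest_fragments(text, min_count=2):
    """Return (fragment, count) pairs for 2- and 3-word sequences."""
    counts = Counter()
    # Split on punctuation and line breaks so a fragment never
    # spans a sentence or clause boundary.
    for chunk in re.split(r"[.,;:!?()\[\]\n]+", text):
        # Lowercasing merges "The pump" and "the pump"; drop .lower()
        # if capitalisation matters in your source language.
        words = re.findall(WORD, chunk.lower())
        for n in (2, 3):
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    # The "sort" step: most frequent first, ties in alphabetical order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Fragments seen fewer than min_count times are treated as noise.
    return [(frag, c) for frag, c in ranked if c >= min_count]
```

Calling `harvest_fragments(text)` on the exported source text gives a ranked list that can still be pasted into Excel for the manual clean-up. Raising `min_count` removes more noise at the cost of missing rare terms.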

Screenshot 2024-07-07 at 14.01.15

Screenshot 2024-07-07 at 14.00.59

[Edited at 2024-07-07 12:20 GMT]


 
Hans Lenting
Hans Lenting
Madalmaad
Liige (2006)
saksa - hollandi
TOPIC STARTER
Got this suggestion Jul 8, 2024

A kind person gave me this suggestion:

sed -E "s/( a| all| allows| are| at| in| for| of| to| with| on| by| or| the| and| is)$//"
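
For those who keep their candidate list in a script rather than a text file, the same clean-up can be sketched in Python. This is only an equivalent under assumptions: the word list is copied from the sed suggestion, and the names `TRAILING_WORDS`, `strip_trailing_word` and `strip_all_trailing_words` are hypothetical. The second function goes one step beyond the sed command by repeating the removal, which the original suggestion does not do.

```python
import re

# Word list taken from the sed suggestion; extend it as needed.
TRAILING_WORDS = ["a", "all", "allows", "are", "at", "in", "for", "of", "to",
                  "with", "on", "by", "or", "the", "and", "is"]

# One space followed by a listed word, anchored at the end of the string.
_TRAILING = re.compile(
    r" (?:" + "|".join(map(re.escape, TRAILING_WORDS)) + r")$"
)


def strip_trailing_word(fragment):
    """Remove one trailing function word, like a single sed pass."""
    return _TRAILING.sub("", fragment)


def strip_all_trailing_words(fragment):
    """Assumed extension: repeat until nothing more can be removed."""
    while True:
        shorter = _TRAILING.sub("", fragment)
        if shorter == fragment:
            return fragment
        fragment = shorter
```

Because the pattern requires a space before the word, a fragment that consists of a single listed word, or a word that merely ends in one of these letter sequences, is left alone.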


 

