Compare the text of two columns

Sboorn · July 12, 2018, 10:44am

Hi to everyone!

I’m working on this dataset to cleaning it
https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results

Now I’m trying to delete from the column Event the words that are contained in the column Sport. (I’ve already done some modification indeed in the column Event I’ve deleted the occurance of the words “man’s” and “women’s”).

Attached you’ll find

The current situtation and the desired result:

How can I solve the problem?

I hope I have been clear, Thank you in advance for help.

jason · July 13, 2018, 3:17pm

Hey @Sboorn,

To remove the duplicated sport prefix from the event field, select the drop down arrow on the Event column, choose calculated field, and paste the below SQL snippet in the text box. Leaving the other two fields as their defaults will replace the existing Event field with this new definition, effectively deleting the extra Sport value from the Event field.

SUBSTR(Event, STRPOS(Sport, Event) + LENGTH(Sport) + 2)

Topic		Replies	Views
Before remove format, I still could see the deleted column	3	825	January 7, 2021
UNION not removing duplicates	5	3103	February 7, 2019
This query cannot be decorrelated	2	150	August 12, 2024
Drop multiples columns VDS	2	1307	September 20, 2019
Split string to columns by comma delimiter	1	344	February 10, 2024

Compare the text of two columns

Related topics