Remove UTM code from the end of HTML link in google sheets

by Chris McMahon   Last Updated January 11, 2018 14:03 PM

I have a question about Google Sheets.

I have a link that I pull in elsewhere. Let's say it looks like this: https://www.mysite.com/site/interesting-content/name-of-article.html?utm_source=source&utm_campaign=ab&utm_medium=medium&utm_term=paid

I've been able to truncate it down to /site/interesting-content/name-of-article.html?utm_source=source&utm_campaign=ab&utm_medium=medium&utm_term=paid using a substitute function.

But i'd like to further truncate it down to /site/interesting-content/name-of-article.html

The problem is, there's quite a few UTM campaigns that I can pull links from. So ideally, I'd like to remove anything after .html, without having to specify EXACTLY what needs to be removed.

I tried using a right function to get a copy of what I want to remove, so that in another cell I could do a substitute function on it to remove the unneeded bits:

=RIGHT(W8,SEARCH(".html",W8))

But it ended up looking like this: ent/name-of-article.html?utm_source=source&utm_campaign=ab&utm_medium=medium&utm_term=paid



Answers 1


Try replacing with REGEXREPLACE:

=REGEXREPLACE(W8,"\?[^?]*$","")

Explanation

REGEXREPLACE takes string (first parameter), then using pattern (second parameter) finds text that starts with ? until end of string, and replaces it with empty string (third parameter).

Kos
Kos
January 11, 2018 14:02 PM

Related Questions





Concatenate in side an arrayformula with google sheets

Updated January 10, 2018 23:03 PM

Counting cells with a certain background?

Updated January 25, 2018 03:03 AM