Details of this Paper

Unix

Description

solution


Question

3.

There is a magical kind of regexp you can use to remove runs of things. Try s/<[^>]*>//g

This says to remove every tag (literally, <, followed by any number of non->'s, followed by >) on each line.

YOUR TURN: Try s/<.*>//g and explain why this removes much more than the intended HTML tags.

YOUR TURN: You could actually make HTML more readable by adding an extra " ___ " before each ">". Try this and show a few lines. Is it easier on the eyes?

4.

Go back to your Subaru data. There is a way to reinsert what you matched by putting a "&" in the replacement string. This is called a "backreference" and it works like this:

Try the sed command s/.*/&\n\n/

This says to match all characters on each line, and replace them with whatever matched, followed by two newlines. Did it work?

Try this: s/.*mi/\t\t\t\t&\n/

YOUR TURN: Why did the last expression work? Can you make it so that only the mileage on the car is tabbed over and followed by an extra empty line? The expression I gave matches both the mileage on the car and the distance of the seller from Springfield, IL.

YOUR TURN: Can you write an s-expression that moves over the miles AND the $ value? It would look something like s/\$[0-9]|[0-9] mi/\t\t\t\t&/ but you'll have to improve the matching part.

5.

Go to this page and grab all the data (ctrl-a, ctrl-c), and put it in the top box on your sed page: http://www.baseball-reference.com/leaders/WAR_top_ten.shtml

Now, we want to erase Barry Bonds from the record book, but not Bobby Bonds. This is not so easy, but we can do it. (Actually I am not that angry at Barry Bonds; I've got much more Rafael Palmeiro disgust.)

First, we can just say s/Barry Bonds//

and we will remove him from each of the two lines that show Barry Bonds' career occurrences in the top ten each year in this valuable statistic (WAR).

YOUR TURN: Actually, can you remove Barry Bonds AND the associated stat, leaving just two empty lines? What s-expression did you use?

Now, we can remove Bonds from each of the years by noticing that the 80s, 90s, and 2000s belonged to Barry, and the 70s belonged to Bobby.

Try this: /^19[89][0-9]/ s/Bonds/XXXX/

This says to apply the s-expression, which replaces Bonds with XXXX, ONLY on those lines that start with a 1980s or 1990s year.

YOUR TURN: Can you write an expression like the last one that catches all the 1980s, 1990s, AND 2000s?

YOUR TURN: Notice that we are redacting his name, but not removing the stat. Can you change the s-expression so it applies only to Barry's years, AND it completely removes his name and his parenthesized statistic?

6.

ADVANCED CHALLENGE

First, notice that you can chain your match-and-substitute /pattern/replacement expressions with a separator. So you could say

/Barry Bonds/ s/.*//, /^200[0-9]/ s/Bonds.....//g

and do both s-expressions in a single pass.

Except that the latter regexp match is not quite right.

YOUR TURN: Repair it.

YOUR TURN: Add an s-expression to change those annoying %09s into something like a _.

Some readings and videos will be posted soon for sed...
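The ideas these exercises rely on (a negated character class versus a greedy .*, "&" in the replacement, an address in front of s, and several commands in one pass) can be tried from a command line as well. The following is only a minimal sketch, assuming a POSIX shell and a standard sed; the sample lines and their numbers are invented for illustration and are not the real Subaru or baseball data. It chains commands with -e, which may differ from the chaining syntax the sed page in the exercise accepts, and it avoids \n and \t in replacements because not every sed supports them there.

```shell
#!/bin/sh
# Hypothetical sample input, made up for this sketch (not the real page data).
html='<b>Subaru</b> Outback <i>2010</i>'
years='1973 Bonds (1.1)
1993 Bonds (2.2)
2001 Bonds (3.3)'

# Negated class: [^>]* cannot cross a >, so each tag is matched on its own
# and the text between tags survives.
stripped=$(printf '%s\n' "$html" | sed 's/<[^>]*>//g')

# Greedy .* stretches from the first < to the last > on the line,
# so everything in between is deleted as well.
greedy=$(printf '%s\n' "$html" | sed 's/<.*>//g')

# & in the replacement stands for the whole matched text.
wrapped=$(printf '%s\n' "$html" | sed 's/[0-9][0-9]*/[&]/')

# An address before s restricts the substitution to matching lines;
# each -e adds another command, all applied in a single pass.
redacted=$(printf '%s\n' "$years" |
  sed -e '/^199/ s/Bonds/XXXX/' -e '/^197/ s/Bonds/Bobby/')

printf 'stripped: %s\n' "$stripped"   # stripped: Subaru Outback 2010
printf 'greedy:   %s\n' "$greedy"     # greedy:   (nothing left)
printf 'wrapped:  %s\n' "$wrapped"
printf '%s\n' "$redacted"
```

Comparing the stripped and greedy results on the same line shows directly what the first YOUR TURN in step 3 asks about. The address patterns here are deliberately narrower than the ones the exercises ask for.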

 

Paper#66052 | Written in 18-Jul-2015

Price : $22