Monday, September 6, 2010

Code to Understand Objective 3.5

Courtesy of SCJP Sun® Certified Programmer for Java™ 6 Study Guide Exam (310-065) (9780071591065)

Objective 3.5: Write code that uses standard J2SE APIs in the java.util and java.util.regex packages to format or parse strings or streams. For strings, write code that uses the Pattern and Matcher classes and the String.split method. Recognize and use regular expression patterns for matching (limited to: .(dot), *(star), +(plus), ?, \d, \s, \w, [ ], ()). The use of *, +, and ? will be limited to greedy quantifiers, and the parenthesis operator will only be used as a grouping mechanism, not for capturing content during matching. For streams, write code using the Formatter and Scanner classes and the PrintWriter.format/printf methods. Recognize and use formatting parameters (limited to: %b, %c, %d, %f, %s) in format strings.

Locating Data via Pattern Matching

Invoke jvm with % java RegexCode <pattern> <matcher>
For example % java RegexCode "\\d\\d\\d" "ab456sd789fsd23423"
This will output:
Pattern is \d\d\d
2 5 456
7 10 789
13 16 234

Searching Using the Scanner Class

Invoke jvm with % java ScanIn <pattern> 
For example % java ScanIn "\\d\\d\\d" 
When prompted with % input: enter  <matcher>
For example % input: 1b2c335f456

Tokenizing with String.split()

The String class’s split() method takes a regex expression as its argument, and returns a String array populated with the tokens produced by the split (or tokenizing) process. This is a handy way to tokenize relatively small pieces of data. The following program uses args[0] to hold a source string, and args[1] to hold the regex pattern to use as a delimiter.

For example:

java SplitTest "ab5 ccc 45 @" "\d"

Will output

count 4
> ccc <
> @<

Tokenizing with Scanner

Demonstrates several of Scanner’s methods and capabilities. Scanner’s default delimiter is whitespace, which this program uses. The program makes two Scanner objects: s1 is iterated over with the more generic next() method, which returns every token as a String, while s2 is analyzed with several of the specialized nextXxx() methods (where Xxx is a primitive type)

If jvm is invoked with java ScanNext "1 true 34 hi"
Output will be hits  ssssibis2

Formatting with printf() and format()

Construction of format strings:

%[arg_index$][flags][width][.precision]conversion char

arg_index An integer followed directly by a $, this indicates which argument should be printed in this position.

flags While many flags are available, for the exam you’ll need to know:

    * “-” Left justify this argument
    * “+” Include a sign (+ or -) with this argument
    * “0” Pad this argument with zeroes
    * “,” Use locale-specific grouping separators (i.e., the comma in 123,456)
    * “(” Enclose negative numbers in parentheses

width This value indicates the minimum number of characters to print. (If you want nice even columns, you’ll use this value extensively.)

precision For the exam you’ll only need this when formatting a floating-point number, and in the case of floating point numbers, precision indicates the number of digits to print after the decimal point conversion The type of argument you’ll be formatting. You’ll need to know:

    * b boolean
    * c char
    * d integer
    * f floating point
    * s string

No comments:

Post a Comment