java.lang.Object
  |
  +--org.comedia.util.scanner.CScanner
Abstract class for language-specific lexical scanners. The class provides general functionality and does not support specific keywords or data types.
Example of scanner usage:
System.out.println("*********** Scanner Test *************");
CScanner scanner = new CScanner();
scanner.setBuffer("while(1.0e2*i := \t\r\n> \"string\'\'\")\n"
  + "// comment\n/.*second\ncomment*./{xxx}");
scanner.setShowEol(true);
scanner.setShowSpace(true);

// Testing string conversions
String str = "The test \"string\"";
System.out.println("Start string: " + str);
str = scanner.wrapString(str);
System.out.println("Wrapped string: " + str);
str = scanner.unwrapString(str);
System.out.println("Unwrapped string: " + str);

System.out.println();
System.out.println("Initial string: " + scanner.getBuffer());

// Print the type, value, position and line number of every token
while (scanner.lex() != EOF) {
  switch (scanner.getTokenType()) {
    case UNKNOWN: System.out.print("Type: Unknown "); break;
    case COMMENT: System.out.print("Type: Comment "); break;
    case KEYWORD: System.out.print("Type: Keyword "); break;
    case TYPE: System.out.print("Type: Type "); break;
    case IDENT: System.out.print("Type: Ident "); break;
    case ALPHA: System.out.print("Type: Alpha "); break;
    case OPERATOR: System.out.print("Type: Operator "); break;
    case BRACE: System.out.print("Type: Brace "); break;
    case SEPARATOR: System.out.print("Type: Separator "); break;
    case EOL: System.out.print("Type: Eol "); break;
    case LF: System.out.print("Type: Lf "); break;
    case SPACE: System.out.print("Type: Space "); break;
    case INT: System.out.print("Type: Int "); break;
    case FLOAT: System.out.print("Type: Float "); break;
    case STRING: System.out.print("Type: String "); break;
    case BOOL: System.out.print("Type: Bool "); break;
    case EOF: System.out.print("Type: Eof "); break;
  }
  System.out.println("Value: '" + scanner.getToken()
    + "' Pos: " + scanner.getPosition()
    + " Line: " + scanner.getLineNo());
}
The result:
*********** Scanner Test *************
Start string: The test "string"
Wrapped string: "The test "string""
Unwrapped string: The test "string"

Initial string: while(1.0e2*i := > "string''") // comment /.*second comment*./{xxx}
Type: Ident Value: 'while' Pos: 0 Line: 0
Type: Brace Value: '(' Pos: 5 Line: 0
Type: Float Value: '1.0e2' Pos: 6 Line: 0
Type: Operator Value: '*' Pos: 11 Line: 0
Type: Ident Value: 'i' Pos: 12 Line: 0
Type: Space Value: ' ' Pos: 13 Line: 0
Type: Separator Value: ':' Pos: 14 Line: 0
Type: Operator Value: '=' Pos: 15 Line: 0
Type: Space Value: ' ' Pos: 16 Line: 0
Type: Lf Value: ' ' Pos: 18 Line: 0
Type: Eol Value: ' ' Pos: 19 Line: 0
Type: Operator Value: '>' Pos: 20 Line: 1
Type: Space Value: ' ' Pos: 21 Line: 1
Type: String Value: '"string''"' Pos: 22 Line: 1
Type: Brace Value: ')' Pos: 32 Line: 1
Type: Eol Value: ' ' Pos: 33 Line: 1
Type: Operator Value: '/' Pos: 34 Line: 2
Type: Operator Value: '/' Pos: 35 Line: 2
Type: Space Value: ' ' Pos: 36 Line: 2
Type: Ident Value: 'comment' Pos: 37 Line: 2
Type: Eol Value: ' ' Pos: 44 Line: 2
Type: Operator Value: '/' Pos: 45 Line: 3
Type: Operator Value: '*' Pos: 46 Line: 3
Type: Ident Value: 'second' Pos: 47 Line: 3
Type: Eol Value: ' ' Pos: 53 Line: 3
Type: Ident Value: 'comment' Pos: 54 Line: 4
Type: Operator Value: '*' Pos: 61 Line: 4
Type: Operator Value: '/' Pos: 62 Line: 4
Type: Brace Value: '{' Pos: 63 Line: 4
Type: Ident Value: 'xxx' Pos: 64 Line: 4
Type: Brace Value: '}' Pos: 67 Line: 4
Inner Class Summary | |
protected class |
CScanner.Lexem
Presents extracted token with information about token type and position in input stream. |
Field Summary | |
static int |
ALPHA
Constant which covers COMMENT, KEYWORD, TYPE or
IDENT tokens. |
static int |
BOOL
Boolean constant token. |
static int |
BRACE
Token constant for the various braces. |
protected java.lang.String |
buffer
Buffer which contains input stream. |
protected int |
bufferLen
The length of the input stream. |
protected int |
bufferLine
Current processed line in the input stream. |
protected int |
bufferPos
Pointer to current position in the input stream. |
static int |
COMMENT
Comment string token constant. |
static int |
CONST
Constant which covers all constant tokens: INT, FLOAT, STRING
and BOOL. |
protected CScanner.Lexem |
current
"Holder" class which contains current extracted token. |
static int |
DELIM
Constant which covers OPERATOR, BRACE, SEPARATOR, EOL, LF
and SPACE tokens. |
static int |
EOF
End-Of-File token constant. |
static int |
EOL
CHAR(13) token constant. |
static int |
FLOAT
Float constant token. |
static int |
IDENT
Identifier token constant. |
static int |
INT
Integer constant token. |
static int |
KEYWORD
Keyword token constant. |
protected java.lang.String[] |
keywords
List of language-specific reserved keywords. |
static int |
LF
CHAR(10) token constant. |
protected CScanner.Lexem |
next
"Holder" class which contains next available token. |
static int |
OPERATOR
Operator token constant. |
protected java.lang.String[] |
operators
List of language-specific operators. |
static int |
SEPARATOR
Token constant for the various lexeme separators. |
protected boolean |
showComment
Determines whether comment tokens are shown or hidden. |
protected boolean |
showEol
Determines whether EOL/LF tokens are shown or hidden. |
protected boolean |
showKeyword
Determines whether keywords are searched for or presented as identifiers. |
protected boolean |
showSpace
Determines whether space tokens are shown or hidden. |
protected boolean |
showString
Determines how extracted string tokens are presented: in ordinary or escape format. |
protected boolean |
showType
Determines whether data type keywords are searched for or presented as identifiers. |
static int |
SPACE
Space token constant. |
static int |
STRING
String constant token. |
static int |
TYPE
Data type keyword token constant. |
protected java.lang.String[] |
types
List of language-specific data type keywords. |
static int |
UNKNOWN
Unknown token constant. |
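The group constants ALPHA, DELIM and CONST are described above as covering several token types, but this page does not list their numeric values. The following is only a sketch of one common way such groups can be built, under the assumption that each token type is a distinct bit and each group is the bitwise OR of its members. The class name TokenGroupsSketch, the values and the helper isInGroup are hypothetical and are not part of CScanner.

```java
/** Hypothetical token-type layout; the real CScanner values may differ. */
final class TokenGroupsSketch {
    // Assumed: one bit per token type.
    static final int COMMENT = 1;
    static final int KEYWORD = 1 << 1;
    static final int TYPE    = 1 << 2;
    static final int IDENT   = 1 << 3;
    // Group constant: the OR of all member token types.
    static final int ALPHA   = COMMENT | KEYWORD | TYPE | IDENT;

    static final int INT     = 1 << 4;
    static final int FLOAT   = 1 << 5;
    static final int STRING  = 1 << 6;
    static final int BOOL    = 1 << 7;
    static final int CONST   = INT | FLOAT | STRING | BOOL;

    /** Returns true if the token type shares at least one bit with the group. */
    static boolean isInGroup(int tokenType, int group) {
        return (tokenType & group) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isInGroup(IDENT, ALPHA));  // true
        System.out.println(isInGroup(FLOAT, ALPHA));  // false
        System.out.println(isInGroup(FLOAT, CONST));  // true
    }
}
```

With such a layout a caller could test a whole family of tokens with one comparison instead of listing every member.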
Constructor Summary | |
CScanner()
Default class constructor. |
Method Summary | |
protected void |
extractNextToken()
Extracts "next" token from the input stream. |
protected void |
extractToken()
Extracts "current" token or copies it from "next" token if it is available. |
java.lang.String |
getBuffer()
Gets an input buffer string. |
int |
getBufferPos()
Gets a current position in the input stream. |
int |
getLineNo()
Gets a line number of the first character of a current token. |
int |
getNextLineNo()
Gets a line number of the first character of a next token. |
int |
getNextPosition()
Gets the position of the first character of a next token in the input stream. |
java.lang.String |
getNextToken()
Gets a next token value. |
int |
getNextTokenType()
Gets a next token type represented by special constant. |
int |
getPosition()
Gets the position of the first character of a current token in the input stream. |
java.lang.String |
getToken()
Gets a current token value. |
int |
getTokenType()
Gets a current token type represented by special constant. |
int |
gotoNextToken()
Continues the parsing process and extracts a current token. |
protected int |
innerProcCComment(CScanner.Lexem curr)
Parses C-like multi-line comment. |
protected int |
innerProcCString(CScanner.Lexem curr)
Parses C-like escape string. |
protected int |
innerProcIdent(CScanner.Lexem curr)
Parses an identifier or a numeric constant token. |
protected int |
innerProcLineComment(CScanner.Lexem curr)
Processes the rest of a single-line comment. |
protected int |
innerProcPasString(CScanner.Lexem curr)
Parses Pascal-like escape string. |
protected int |
innerProcString(CScanner.Lexem curr)
Parses a string. |
protected int |
innerStartLex(CScanner.Lexem curr)
Starts the first stage of lexical parsing. |
static boolean |
isAlpha(char c)
Checks whether a character is an alpha character. |
static boolean |
isDelim(char c)
Checks whether a character is a delimiter. |
static boolean |
isDigit(char c)
Checks whether a character is a digit. |
static boolean |
isEol(char c)
Checks whether a character is an EOL (CHAR(13)) symbol. |
static boolean |
isLetter(char c)
Checks whether a character is a letter. |
static boolean |
isQuote(char c)
Checks whether a character is a quote. |
boolean |
isShowComment()
Gets a ShowComment property value. |
boolean |
isShowEol()
Gets a ShowEol property value. |
boolean |
isShowKeyword()
Gets a ShowKeyword property value. |
boolean |
isShowSpace()
Gets a ShowSpace property value. |
boolean |
isShowString()
Gets a ShowString property value. |
boolean |
isShowType()
Gets a ShowType property value. |
static boolean |
isWhite(char c)
Checks whether a character is white space. |
int |
lex()
Starts the parsing process and extracts a current token. |
protected int |
lowRunLex(CScanner.Lexem curr)
Gets a low-level token. |
static void |
main(java.lang.String[] args)
The main function for test purposes. |
void |
restart()
Restarts the parsing process by reassigning the same input buffer. |
protected int |
runLex(CScanner.Lexem curr)
Extracts next available token from the input stream. |
protected boolean |
searchForString(java.lang.String s,
java.lang.String[] a)
Searches a string value inside a string array. |
void |
setBuffer(java.lang.String s)
Sets a new input buffer and resets buffer pointers. |
void |
setShowComment(boolean value)
Sets a new ShowComment property value. |
void |
setShowEol(boolean value)
Sets a new ShowEol property value. |
void |
setShowKeyword(boolean value)
Sets a new ShowKeyword property value. |
void |
setShowSpace(boolean value)
Sets a new ShowSpace property value. |
void |
setShowString(boolean value)
Sets a new ShowString property value. |
void |
setShowType(boolean value)
Sets a new ShowType property value. |
static java.lang.String |
unwrapString(java.lang.String s)
Converts a string from special escape format limited with quotes into ordinary (local) presentation. |
static java.lang.String |
wrapString(java.lang.String s)
Converts a string from ordinary into escape format limited with quotes. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
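The summary above describes a "current" and a "next" token holder: extractToken() takes the current token from the "next" holder when one is available, and the getNext...() methods expose the token that follows. The sketch below illustrates that one-token lookahead pattern in isolation. It is not the CScanner implementation: the class LookaheadSketch, the private read() helper and the whitespace-based splitting are assumptions made only for illustration.

```java
/** Minimal one-token lookahead over whitespace-separated words (not CScanner itself). */
final class LookaheadSketch {
    private final String[] words;
    private int pos = 0;               // index of the next unread word
    private String current;            // "current" holder
    private String next;               // "next" holder, filled lazily
    private boolean nextReady = false; // true while "next" holds a peeked token

    LookaheadSketch(String input) {
        String trimmed = input.trim();
        words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Reads one raw token, or returns null at the end of the input. */
    private String read() {
        return pos < words.length ? words[pos++] : null;
    }

    /** Advances: reuses the peeked token if there is one, otherwise reads a new one. */
    String gotoNextToken() {
        if (nextReady) {
            current = next;
            nextReady = false;
        } else {
            current = read();
        }
        return current;
    }

    /** Peeks at the token after the current one without consuming it. */
    String getNextToken() {
        if (!nextReady) {
            next = read();
            nextReady = true;
        }
        return next;
    }

    /** Returns the current token. */
    String getToken() {
        return current;
    }

    public static void main(String[] args) {
        LookaheadSketch s = new LookaheadSketch("while ( i )");
        s.gotoNextToken();
        System.out.println(s.getToken() + " is followed by " + s.getNextToken());
        s.gotoNextToken();                 // taken from the "next" holder
        System.out.println(s.getToken());
    }
}
```

Filling the "next" holder lazily means a caller that never peeks pays nothing for the lookahead.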
Field Detail |
public static final int UNKNOWN
public static final int COMMENT
public static final int KEYWORD
public static final int TYPE
public static final int IDENT
public static final int ALPHA
Constant which covers COMMENT, KEYWORD, TYPE or IDENT tokens.
public static final int OPERATOR
public static final int BRACE
public static final int SEPARATOR
public static final int EOL
public static final int LF
public static final int SPACE
public static final int DELIM
Constant which covers OPERATOR, BRACE, SEPARATOR, EOL, LF and SPACE tokens.
public static final int INT
public static final int FLOAT
public static final int STRING
public static final int BOOL
public static final int CONST
Constant which covers all constant tokens: INT, FLOAT, STRING and BOOL.
public static final int EOF
protected java.lang.String buffer
protected int bufferPos
protected int bufferLine
protected int bufferLen
protected CScanner.Lexem current
protected CScanner.Lexem next
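protected boolean showComment
Determines whether comment tokens are shown or hidden. It is FALSE by default.
protected boolean showString
Determines how extracted string tokens are presented: in ordinary or escape format. TRUE means to present strings in escape format. It is TRUE by default.
protected boolean showEol
Determines whether EOL/LF tokens are shown or hidden. It is FALSE by default.
protected boolean showKeyword
Determines whether keywords are searched for or presented as identifiers. It is TRUE by default.
protected boolean showType
Determines whether data type keywords are searched for or presented as identifiers. It is TRUE by default.
protected boolean showSpace
Determines whether space tokens are shown or hidden. It is FALSE by default.
protected java.lang.String[] operators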
protected java.lang.String[] types
protected java.lang.String[] keywords
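The keywords and types lists, the showKeyword and showType flags and searchForString() together suggest how an alphabetic token might be classified. The sketch below shows one way this could work, assuming a case-sensitive linear search and hypothetical constant values; the class KeywordLookupSketch and its classify() method are illustrative names and do not appear in CScanner.

```java
/** Sketch of keyword/type classification; names and values are hypothetical. */
final class KeywordLookupSketch {
    static final int KEYWORD = 1;  // assumed values, not the real constants
    static final int TYPE = 2;
    static final int IDENT = 3;

    private final String[] keywords;
    private final String[] types;
    boolean showKeyword = true;    // TRUE by default, as documented
    boolean showType = true;       // TRUE by default, as documented

    KeywordLookupSketch(String[] keywords, String[] types) {
        this.keywords = keywords;
        this.types = types;
    }

    /** Linear, case-sensitive search for s inside a (case sensitivity is an assumption). */
    static boolean searchForString(String s, String[] a) {
        if (s == null || a == null) {
            return false;
        }
        for (String item : a) {
            if (s.equals(item)) {
                return true;
            }
        }
        return false;
    }

    /** Classifies a word; a disabled flag makes the word fall through to IDENT. */
    int classify(String word) {
        if (showKeyword && searchForString(word, keywords)) {
            return KEYWORD;
        }
        if (showType && searchForString(word, types)) {
            return TYPE;
        }
        return IDENT;
    }

    public static void main(String[] args) {
        // With empty lists every word is an identifier.
        KeywordLookupSketch base = new KeywordLookupSketch(new String[0], new String[0]);
        System.out.println(base.classify("while") == IDENT);    // true

        KeywordLookupSketch c = new KeywordLookupSketch(
                new String[] {"while", "if"}, new String[] {"int"});
        System.out.println(c.classify("while") == KEYWORD);     // true
        c.showKeyword = false;
        System.out.println(c.classify("while") == IDENT);       // true
    }
}
```

The first case matches the usage example at the top of this page, where the base scanner reports 'while' as Ident because no keyword list has been supplied.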
Constructor Detail |
public CScanner()
Method Detail |
protected int lowRunLex(CScanner.Lexem curr)
Parameters:
curr - a "Holder" class which contains the extracted token.
protected int runLex(CScanner.Lexem curr)
Parameters:
curr - a "Holder" class which contains the extracted token.
protected void extractToken()
protected void extractNextToken()
protected int innerStartLex(CScanner.Lexem curr)
Parameters:
curr - a "Holder" class which contains the token being extracted.
protected int innerProcLineComment(CScanner.Lexem curr)
Parameters:
curr - a "Holder" class which contains the token being extracted.
protected int innerProcCComment(CScanner.Lexem curr)
Parameters:
curr - a "Holder" class which contains the token being extracted.
protected int innerProcIdent(CScanner.Lexem curr)
Parameters:
curr - a "Holder" class which contains the token being extracted.
protected int innerProcString(CScanner.Lexem curr)
Parameters:
curr - a "Holder" class which contains the token being extracted.
protected int innerProcCString(CScanner.Lexem curr)
Parameters:
curr - a "Holder" class which contains the token being extracted.
protected int innerProcPasString(CScanner.Lexem curr)
Parameters:
curr - a "Holder" class which contains the token being extracted.
protected boolean searchForString(java.lang.String s, java.lang.String[] a)
Parameters:
s - the string value to search for.
a - the string array to search in.
public void restart()
public static java.lang.String wrapString(java.lang.String s)
Parameters:
s - a string in ordinary (local) presentation.
public static java.lang.String unwrapString(java.lang.String s)
Parameters:
s - a string in special escape format.
public static boolean isAlpha(char c)
Parameters:
c - the character to check.
public static boolean isLetter(char c)
Parameters:
c - the character to check.
public static boolean isDigit(char c)
public static boolean isDelim(char c)
Parameters:
c - the character to check.
public static boolean isWhite(char c)
Parameters:
c - the character to check.
public static boolean isEol(char c)
Parameters:
c - the character to check.
public static boolean isQuote(char c)
Parameters:
c - the character to check.
public int lex()
public int gotoNextToken()
lex method.
public boolean isShowComment()
Gets a ShowComment property value. ShowComment determines whether comment tokens are shown or hidden. It is FALSE by default.
public void setShowComment(boolean value)
Sets a new ShowComment property value. ShowComment determines whether comment tokens are shown or hidden. It is FALSE by default.
Parameters:
value - a new ShowComment property value.
public boolean isShowEol()
Gets a ShowEol property value. ShowEol determines whether EOL/LF tokens are shown or hidden. It is FALSE by default.
public void setShowEol(boolean value)
Sets a new ShowEol property value. ShowEol determines whether EOL/LF tokens are shown or hidden. It is FALSE by default.
Parameters:
value - a new ShowEol property value.
public boolean isShowString()
Gets a ShowString property value. ShowString determines how extracted string tokens are presented: in ordinary or escape format. TRUE means to present strings in escape format. It is TRUE by default.
public void setShowString(boolean value)
Sets a new ShowString property value. ShowString determines how extracted string tokens are presented: in ordinary or escape format. TRUE means to present strings in escape format. It is TRUE by default.
Parameters:
value - a new ShowString property value.
public boolean isShowKeyword()
Gets a ShowKeyword property value. ShowKeyword determines whether keywords are searched for or presented as identifiers. It is TRUE by default.
public void setShowKeyword(boolean value)
Sets a new ShowKeyword property value. ShowKeyword determines whether keywords are searched for or presented as identifiers. It is TRUE by default.
Parameters:
value - a new ShowKeyword property value.
public boolean isShowType()
Gets a ShowType property value. ShowType determines whether data type keywords are searched for or presented as identifiers. ShowType is TRUE by default.
public void setShowType(boolean value)
Sets a new ShowType property value. ShowType determines whether data type keywords are searched for or presented as identifiers. ShowType is TRUE by default.
Parameters:
value - a new ShowType property value.
public boolean isShowSpace()
Gets a ShowSpace property value. ShowSpace determines whether space tokens are shown or hidden. It is FALSE by default.
public void setShowSpace(boolean value)
Sets a new ShowSpace property value. ShowSpace determines whether space tokens are shown or hidden. It is FALSE by default.
Parameters:
value - a new ShowSpace property value.
public java.lang.String getBuffer()
public void setBuffer(java.lang.String s)
Parameters:
s - a new input stream.
public int getBufferPos()
public int getPosition()
public int getLineNo()
public java.lang.String getToken()
public int getTokenType()
public int getNextPosition()
public int getNextLineNo()
public java.lang.String getNextToken()
public int getNextTokenType()
public static void main(java.lang.String[] args)
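The wrapString and unwrapString methods convert between ordinary strings and an escape format limited with quotes, but this page does not specify the escape rules. The following sketch assumes C-like backslash escapes for the quote, the backslash, tab, carriage return and line feed. That choice is an assumption, and WrapStringSketch is a hypothetical class, not the CScanner implementation.

```java
/** Sketch of quote-delimited escaping, assuming C-like backslash escapes. */
final class WrapStringSketch {
    /** Escapes special characters and encloses the result in double quotes. */
    static String wrapString(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:   out.append(c);
            }
        }
        return out.append('"').toString();
    }

    /** Removes the enclosing quotes, if present, and decodes the escapes. */
    static String unwrapString(String s) {
        int start = 0;
        int end = s.length();
        if (end >= 2 && s.charAt(0) == '"' && s.charAt(end - 1) == '"') {
            start = 1;
            end--;
        }
        StringBuilder out = new StringBuilder();
        for (int i = start; i < end; i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < end) {
                char e = s.charAt(++i);
                switch (e) {
                    case 'n': out.append('\n'); break;
                    case 'r': out.append('\r'); break;
                    case 't': out.append('\t'); break;
                    default:  out.append(e);   // covers \" and \\
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String str = "The test \"string\"";
        String wrapped = wrapString(str);
        System.out.println("Wrapped string: " + wrapped);
        System.out.println("Unwrapped string: " + unwrapString(wrapped));
    }
}
```

Under these assumptions the two methods are inverses of each other, so a string survives a wrap followed by an unwrap unchanged.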