parsing.h File Reference

#include "common/common.h"
#include <iostream>

Classes

struct  utf8_char_t
 a UTF-8 character.



Defines

#define isSpace(c)   isspace(c)

Enumerations

enum  eParseBehavior {
  eParse_None = 0x0000,
  eParse_StripComments = 0x0001,
  eParse_StripBogus = 0x0002,
  eParse_RespectQuotes = 0x0004,
  eParse_Strip = 0x0003,
  eParse_Invalid = 0x8000
}
 eParseBehavior: these can be OR'd together.

Typedefs

typedef Dictionary dictionary_t
 dictionary routines. These let you construct a dictionary (key -> value map) based on an input string, and then look for required/optional values.

Functions

bool isSingleByteUTF8Character (IN char a) throw ()
 is this character a single-byte (ASCII) UTF-8 character?
bool isMultiByteUTF8Character (IN char a) throw ()
 is this character the leading byte of a multi-byte UTF-8 character?
bool isLeadingUTF8Byte (IN char a) throw ()
 is this character a (likely) leading byte for a UTF-8 character?
bool isTrailingUTF8Byte (IN char a) throw ()
 is this character a trailing byte for a multi-byte UTF-8 character?
bool getUTF8CharacterFromStream (IN std::istream &stream, OUT utf8_char_t &c)
 get the next (UTF-8) character from the stream. Returns false on eof (end of stream).
const char * getUTF8CharacterFromString (IN const char *input, OUT utf8_char_t &c)
 get the next (UTF-8) character from the given string. Returns a pointer to the next character in the string.
int getUTF8ByteCount (IN char a)
 given an initial byte of a (potentially) multi-byte UTF-8 character, return the total number of bytes in the character. Returns -1 on error.
const char * getNextWord (IN const char *text, OUT std::string &word)
 retrieves the next word from the specified line of text, and returns a pointer to the first character in the line after the word.
std::string getNextLineFromStream (IN std::istream &stream, IN eParseBehavior behavior)
 Reads to the next newline or end of stream, and stashes all characters (except the final newline) in the output string.
const char * getNextTokenFromString (IN const char *input, OUT std::string &token, IN eParseBehavior)
 finds the next whitespace-delimited token, and puts it in the given token (std::string).
const char * getNextTokenFromString (IN const char *input, IO char *buffer, IN int buffer_size, IN eParseBehavior, OUT int &chars) throw ()
 finds the next whitespace-delimited token, and puts it in the given buffer.
const char * expectFromString (IN const char *input, IN const char *expect, IN eParseBehavior)
 expects the given token to be next in the string.
const char * getFloatFromString (IN const char *input, OUT float &x) throw ()
 helper method to extract a float (assuming it is the next token). This routine is UTF-8 compliant.
int readFloatsFromString (IN const char *input, IN int nFloats, OUT float *output) throw ()
 helper method to read up to N floats.
bool getBooleanFromString (IN const char *input) throw ()
 tries to determine if the given string is a true or false value.
bool isBogus (IN char a) throw ()
 true if the given character is bogus ('\r', etc)
void getDictionaryFromString (IN const char *string, IN const char *debug_info, OUT dictionary_t &data)
const char * getValue (IN const dictionary_t &, IN const char *key)
const char * getRequiredValue (IN const dictionary_t &, IN const char *key)
const char * getOptionalValue (IN const dictionary_t &, IN const char *key, IN const char *default_value) throw ()

Define Documentation

#define isSpace (   c  )     isspace(c)

Definition at line 166 of file parsing.h.


Typedef Documentation

typedef Dictionary dictionary_t

dictionary routines These let you construct a dictionary (key -> value map) based on an input string, and then look for required/optional values.

Definition at line 274 of file parsing.h.


Enumeration Type Documentation

eParseBehavior: these can be OR'd together

Enumerator:
eParse_None 

no special behavior

eParse_StripComments 

strip comments (start with #)

eParse_StripBogus 

strip bogus characters ('\r')

eParse_RespectQuotes 

treat quoted tokens as a unit

eParse_Strip 

strip comments + bogus

eParse_Invalid 

Definition at line 181 of file parsing.h.
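
The enumerator values above are bit flags, and eParse_Strip (0x0003) is simply eParse_StripComments | eParse_StripBogus. The following is a minimal sketch of combining and testing flags. It uses a local copy of the enum so that it compiles without parsing.h, and the helpers combine() and hasBehavior() are hypothetical names, not part of the documented API.

```cpp
// Local copy of the enumerator values listed above, so the sketch compiles
// on its own. Real code would include parsing.h instead.
enum eParseBehavior {
    eParse_None          = 0x0000,
    eParse_StripComments = 0x0001,
    eParse_StripBogus    = 0x0002,
    eParse_RespectQuotes = 0x0004,
    eParse_Strip         = 0x0003,
    eParse_Invalid       = 0x8000
};

// Hypothetical helper: OR two behaviors together. Applying | to unscoped
// enums yields an int, so the result is cast back to the enum type.
inline eParseBehavior combine(eParseBehavior a, eParseBehavior b) {
    return static_cast<eParseBehavior>(static_cast<int>(a) | static_cast<int>(b));
}

// Hypothetical helper: true if every bit of `flag` is set in `behavior`.
inline bool hasBehavior(eParseBehavior behavior, eParseBehavior flag) {
    return (static_cast<int>(behavior) & static_cast<int>(flag)) == static_cast<int>(flag);
}
```

For example, combine(eParse_Strip, eParse_RespectQuotes) would request comment stripping, bogus-character stripping, and quote handling in a single argument.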


Function Documentation

bool isSingleByteUTF8Character ( IN char  a  )  throw () [inline]

is this character a single-byte (ASCII) UTF-8 character?

Definition at line 103 of file parsing.h.

bool isMultiByteUTF8Character ( IN char  a  )  throw () [inline]

is this character the leading byte of a multi-byte UTF-8 character?

Definition at line 112 of file parsing.h.

bool isLeadingUTF8Byte ( IN char  a  )  throw () [inline]

is this character a (likely) leading byte for a UTF-8 character?

Definition at line 121 of file parsing.h.

bool isTrailingUTF8Byte ( IN char  a  )  throw () [inline]

is this character a trailing byte for a multi-byte UTF-8 character?

Definition at line 131 of file parsing.h.
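
The four predicates above classify a single byte by its high bits. The sketch below shows how such checks can be written from the standard UTF-8 bit patterns. It is not the code from parsing.h: the sketch* names are hypothetical, and the header's exact definitions (in particular what counts as a "likely" leading byte) may differ.

```cpp
// Hypothetical stand-ins based on the standard UTF-8 bit patterns.
// The cast to unsigned char avoids sign-extension surprises when char is signed.

// 0xxxxxxx: a complete one-byte (ASCII) character.
inline bool sketchIsSingleByte(char a) {
    return (static_cast<unsigned char>(a) & 0x80) == 0x00;
}

// 10xxxxxx: a continuation byte inside a multi-byte character.
inline bool sketchIsTrailingByte(char a) {
    return (static_cast<unsigned char>(a) & 0xC0) == 0x80;
}

// 11xxxxxx: the first byte of a multi-byte character. This is a loose check;
// it also accepts 0xF8..0xFF, which never start a valid sequence.
inline bool sketchIsMultiByteLead(char a) {
    return (static_cast<unsigned char>(a) & 0xC0) == 0xC0;
}
```

Under these definitions every byte falls into exactly one of the three classes, which is what lets a parser resynchronize after a damaged sequence by skipping trailing bytes.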

bool getUTF8CharacterFromStream ( IN std::istream &  stream,
OUT utf8_char_t &  c 
)

get the next (UTF-8) character from the stream. Returns false on eof (end of stream).

Definition at line 131 of file parsing.cpp.

const char* getUTF8CharacterFromString ( IN const char *  input,
OUT utf8_char_t &  c 
)

get the next (UTF-8) character from the given string. Returns a pointer to the next character in the string.

Definition at line 178 of file parsing.cpp.

int getUTF8ByteCount ( IN char  a  ) 

given an initial byte of a (potentially) multi-byte UTF-8 character, return the total number of bytes in the character. Returns -1 on error.

Definition at line 222 of file parsing.cpp.
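
As an illustration of the contract described for getUTF8ByteCount, here is a minimal sketch that derives the sequence length from the leading byte using the standard UTF-8 encoding rules. The name sketchUtf8ByteCount is hypothetical, and the real routine in parsing.cpp may treat edge cases differently.

```cpp
// Hypothetical stand-in: sequence length implied by a leading byte,
// or -1 if the byte cannot start a UTF-8 character.
inline int sketchUtf8ByteCount(char a) {
    const unsigned char b = static_cast<unsigned char>(a);
    if ((b & 0x80) == 0x00) return 1;  // 0xxxxxxx: ASCII
    if ((b & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((b & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((b & 0xF8) == 0xF0) return 4;  // 11110xxx
    return -1;  // trailing byte (10xxxxxx) or invalid lead (11111xxx)
}
```

A caller can use the result to advance a pointer by whole characters, treating -1 as a signal to stop or skip a byte.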

const char* getNextWord ( IN const char *  text,
OUT std::string &  word 
)

retrieves the next word from the specified line of text, and returns a pointer to the first character in the line after the word.

Leading whitespace will be ignored when parsing. The word returned will either be empty, a newline, or a set of non-space characters. If the returned word is empty, then NULL is also the return value (word can only be empty at end-of-line). Note that you can have non-empty words returned with a NULL return value.

Definition at line 701 of file parsing.cpp.

std::string getNextLineFromStream ( IN std::istream &  stream,
IN eParseBehavior  behavior 
)

Reads to the next newline or end of stream, and stashes all characters (except the final newline) in the output string.

The caller can ask to have comments and/or bogus characters stripped out. Returns a std::string containing the line. Throws on errors. This routine is UTF-8 compliant.

Definition at line 254 of file parsing.cpp.

const char* getNextTokenFromString ( IN const char *  input,
OUT std::string &  token,
IN  eParseBehavior 
)

finds the next whitespace-delimited token, and puts it in the given token (std::string).

Returns a pointer to the character (whitespace or end of string) right after the token. This routine is UTF-8 compliant.

Definition at line 308 of file parsing.cpp.

const char* getNextTokenFromString ( IN const char *  input,
IO char *  buffer,
IN int  buffer_size,
IN  eParseBehavior,
OUT int &  chars 
) throw ()

finds the next whitespace-delimited token, and puts it in the given buffer.

Returns a pointer to the character right after the token. The chars parameter returns the number of characters parsed from the string, even if not all of them were copied into the buffer. If the buffer isn't big enough for the token, the token is truncated. The buffer is always null-terminated (the routine saves enough room for the final null), so you can detect truncation by checking chars >= buffer_size. This routine is UTF-8 compliant.

Definition at line 364 of file parsing.cpp.
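
The truncation rule above (chars >= buffer_size) is easiest to see with a concrete buffer. The sketch below is a simplified stand-in written only from the description: sketchNextToken and wasTruncated are hypothetical names, the eParseBehavior argument is omitted, and characters are counted as bytes, which may not match how the real UTF-8-aware routine counts them.

```cpp
#include <cctype>

// Hypothetical stand-in for the buffer overload: skips leading whitespace,
// copies at most buffer_size - 1 bytes of the token, always null-terminates,
// and reports the full token length in `chars`.
const char* sketchNextToken(const char* input, char* buffer, int buffer_size, int& chars) {
    chars = 0;
    while (*input && std::isspace(static_cast<unsigned char>(*input))) ++input;

    int copied = 0;
    while (*input && !std::isspace(static_cast<unsigned char>(*input))) {
        if (copied < buffer_size - 1) buffer[copied++] = *input;  // room left for '\0'
        ++chars;   // counts every token byte, copied or not
        ++input;
    }
    if (buffer_size > 0) buffer[copied] = '\0';
    return input;  // first character after the token
}

// The documented check: the token did not fit if chars >= buffer_size.
inline bool wasTruncated(int chars, int buffer_size) {
    return chars >= buffer_size;
}
```

With a 4-byte buffer and the input "  hello world", this stand-in stores "hel", sets chars to 5, and returns a pointer to the space before "world", so wasTruncated(5, 4) reports the truncation.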

const char* expectFromString ( IN const char *  input,
IN const char *  expect,
IN  eParseBehavior 
)

expects the given token to be next in the string.

Throws if the next token is something else. Returns a pointer to the character immediately following the expected token.

Definition at line 407 of file parsing.cpp.

const char* getFloatFromString ( IN const char *  input,
OUT float &  x 
) throw ()

helper method to extract a float (assuming it is the next token). This routine is UTF-8 compliant.

Definition at line 619 of file parsing.cpp.

int readFloatsFromString ( IN const char *  input,
IN int  nFloats,
OUT float *  output 
) throw ()

helper method to read up to N floats.

Returns the number read. Assumes that the caller doesn't need the string returned! The caller must provide a pre-allocated array of floats for output.

Definition at line 648 of file parsing.cpp.
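
To illustrate the "read up to N floats into a caller-provided array" contract, here is a minimal sketch built on std::strtof. It is an assumption about the behavior, not the implementation in parsing.cpp: sketchReadFloats is a hypothetical name, and the real routine may tokenize or handle malformed input differently.

```cpp
#include <cstdlib>

// Hypothetical stand-in: parses whitespace-separated floats from `input`
// into `output`, stopping after nFloats values or at the first text that
// does not parse as a float. Returns how many values were stored.
int sketchReadFloats(const char* input, int nFloats, float* output) {
    int count = 0;
    while (count < nFloats) {
        char* end = nullptr;
        const float value = std::strtof(input, &end);  // skips leading whitespace
        if (end == input) break;  // nothing converted: end of string or bad token
        output[count++] = value;
        input = end;
    }
    return count;
}
```

Because the return value can be smaller than nFloats, a caller expecting exactly three components (for example a vector) should compare the result against 3 before using the array.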

bool getBooleanFromString ( IN const char *  input  )  throw ()

tries to determine if the given string is a true or false value.

This is NOT localized. This is for machine-parsed input only, such as config files. A string is considered false if it is the word "false" (any case), the single letter "F", the character "0", or empty. Any other value is considered true. Note that expressions that evaluate to zero but are not a single digit, such as "-0" or "+0", are considered true.

Definition at line 675 of file parsing.cpp.
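
The rules above can be restated as a short function. This is a sketch written from the description only: sketchBooleanFromString is a hypothetical name, and it assumes that only an uppercase "F" counts as the single-letter form, that no whitespace is trimmed, and that a null pointer is treated like an empty string. None of these assumptions is confirmed by the documentation.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for the documented rules: false for "", "0", "F",
// and "false" in any case; true for everything else.
bool sketchBooleanFromString(const char* input) {
    if (input == nullptr) return false;  // assumption: null behaves like empty
    const std::string s(input);
    if (s.empty() || s == "0" || s == "F") return false;

    if (s.size() == 5) {
        std::string lower;
        for (char c : s) {
            lower += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        if (lower == "false") return false;
    }
    return true;  // includes "-0" and "+0", as noted above
}
```

Note the consequence for config files: a value such as "no" or "off" is treated as true under these rules, so writers should stick to the listed false spellings.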

bool isBogus ( IN char  a  )  throw ()

true if the given character is bogus ('\r', etc)

Definition at line 442 of file parsing.cpp.

void getDictionaryFromString ( IN const char *  string,
IN const char *  debug_info,
OUT dictionary_t &  data 
)

Definition at line 524 of file parsing.cpp.

const char* getValue ( IN const dictionary_t &,
IN const char *  key 
)

Definition at line 561 of file parsing.cpp.

const char* getRequiredValue ( IN const dictionary_t &,
IN const char *  key 
)

Definition at line 576 of file parsing.cpp.

const char* getOptionalValue ( IN const dictionary_t &,
IN const char *  key,
IN const char *  default_value 
) throw ()

Definition at line 600 of file parsing.cpp.