Copyright | © 2015–2016 Megaparsec contributors © 2007 Paolo Martini © 1999–2001 Daan Leijen |
---|---|
License | FreeBSD |
Maintainer | Mark Karpov <[email protected]> |
Stability | experimental |
Portability | non-portable |
Safe Haskell | None |
Language | Haskell2010 |
Text.Megaparsec.Prim
Description
The primitive parser combinators.
- data State s = State {
- stateInput :: s
- statePos :: NonEmpty SourcePos
- stateTabWidth :: Pos
- class Ord (Token s) => Stream s where
- type Parsec e s = ParsecT e s Identity
- data ParsecT e s m a
- class (ErrorComponent e, Stream s, Alternative m, MonadPlus m) => MonadParsec e s m | m -> e s where
- (<?>) :: MonadParsec e s m => m a -> String -> m a
- unexpected :: MonadParsec e s m => ErrorItem (Token s) -> m a
- getInput :: MonadParsec e s m => m s
- setInput :: MonadParsec e s m => s -> m ()
- getPosition :: MonadParsec e s m => m SourcePos
- setPosition :: MonadParsec e s m => SourcePos -> m ()
- pushPosition :: MonadParsec e s m => SourcePos -> m ()
- popPosition :: MonadParsec e s m => m ()
- getTabWidth :: MonadParsec e s m => m Pos
- setTabWidth :: MonadParsec e s m => Pos -> m ()
- setParserState :: MonadParsec e s m => State s -> m ()
- runParser :: Parsec e s a -> String -> s -> Either (ParseError (Token s) e) a
- runParser' :: Parsec e s a -> State s -> (State s, Either (ParseError (Token s) e) a)
- runParserT :: Monad m => ParsecT e s m a -> String -> s -> m (Either (ParseError (Token s) e) a)
- runParserT' :: Monad m => ParsecT e s m a -> State s -> m (State s, Either (ParseError (Token s) e) a)
- parse :: Parsec e s a -> String -> s -> Either (ParseError (Token s) e) a
- parseMaybe :: (ErrorComponent e, Stream s) => Parsec e s a -> s -> Maybe a
- parseTest :: (ShowErrorComponent e, Ord (Token s), ShowToken (Token s), Show a) => Parsec e s a -> s -> IO ()
Data types
This is Megaparsec's state, it's parametrized over stream type s
.
Constructors
State | |
Fields
|
class Ord (Token s) => Stream s where Source #
An instance of Stream s
has stream type s
. Token type is determined
by the stream and can be found via Token
type function.
Methods
uncons :: s -> Maybe (Token s, s) Source #
Get next token from the stream. If the stream is empty, return
Nothing
.
updatePos :: Proxy s -> Pos -> SourcePos -> Token s -> (SourcePos, SourcePos) Source #
Update position in stream given tab width, current position, and current token. The result is a tuple where the first element will be used to report parse errors for current token, while the second element is the incremented position that will be stored in parser's state.
When you work with streams where elements do not contain information
about their position in input, result is usually consists of the third
argument unchanged and incremented position calculated with respect to
current token. This is how default instances of Stream
work (they use
defaultUpdatePos
, which may be a good starting point for your own
position-advancing function).
When you wish to deal with stream of tokens where every token “knows” its start and end position in input (for example, you have produced the stream with Happy/Alex), then the best strategy is to use the start position as actual element position and provide the end position of the token as incremented one.
Since: 5.0.0
type Parsec e s = ParsecT e s Identity Source #
Parsec
is non-transformer variant of more general ParsecT
monad transformer.
ParsecT e s m a
is a parser with custom data component of error e
,
stream type s
, underlying monad m
and return type a
.
Instances
(ErrorComponent e, Stream s) => MonadParsec e s (ParsecT e s m) Source # | |
(ErrorComponent e, Stream s, MonadError e' m) => MonadError e' (ParsecT e s m) Source # | |
(ErrorComponent e, Stream s, MonadReader r m) => MonadReader r (ParsecT e s m) Source # | |
(ErrorComponent e, Stream s, MonadState st m) => MonadState st (ParsecT e s m) Source # | |
MonadTrans (ParsecT e s) Source # | |
(ErrorComponent e, Stream s) => Monad (ParsecT e s m) Source # | |
Functor (ParsecT e s m) Source # | |
(ErrorComponent e, Stream s) => MonadFail (ParsecT e s m) Source # | |
(ErrorComponent e, Stream s) => Applicative (ParsecT e s m) Source # | |
(ErrorComponent e, Stream s, MonadIO m) => MonadIO (ParsecT e s m) Source # | |
(ErrorComponent e, Stream s) => Alternative (ParsecT e s m) Source # | |
(ErrorComponent e, Stream s) => MonadPlus (ParsecT e s m) Source # | |
(ErrorComponent e, Stream s, MonadCont m) => MonadCont (ParsecT e s m) Source # | |
Primitive combinators
class (ErrorComponent e, Stream s, Alternative m, MonadPlus m) => MonadParsec e s m | m -> e s where Source #
Type class describing parsers independent of input type.
Minimal complete definition
failure, label, try, lookAhead, notFollowedBy, withRecovery, eof, token, tokens, getParserState, updateParserState
Methods
failure :: Set (ErrorItem (Token s)) -> Set (ErrorItem (Token s)) -> Set e -> m a Source #
The most general way to stop parsing and report ParseError
.
unexpected
is defined in terms of this function:
unexpected item = failure (Set.singleton item) Set.empty Set.empty
Since: 4.2.0
label :: String -> m a -> m a Source #
The parser label name p
behaves as parser p
, but whenever the
parser p
fails without consuming any input, it replaces names of
“expected” tokens with the name name
.
hidden p
behaves just like parser p
, but it doesn't show any
“expected” tokens in error message when p
fails.
The parser try p
behaves like parser p
, except that it
pretends that it hasn't consumed any input when an error occurs.
This combinator is used whenever arbitrary look ahead is needed. Since
it pretends that it hasn't consumed any input when p
fails, the
(<|>
) combinator will try its second alternative even when the
first parser failed while consuming input.
For example, here is a parser that is supposed to parse word “let” or “lexical”:
>>>
parseTest (string "let" <|> string "lexical") "lexical"
1:1: unexpected "lex" expecting "let"
What happens here? First parser consumes “le” and fails (because it
doesn't see a “t”). The second parser, however, isn't tried, since the
first parser has already consumed some input! try
fixes this behavior
and allows backtracking to work:
>>>
parseTest (try (string "let") <|> string "lexical") "lexical"
"lexical"
try
also improves error messages in case of overlapping alternatives,
because Megaparsec's hint system can be used:
>>>
parseTest (try (string "let") <|> string "lexical") "le"
1:1: unexpected "le" expecting "let" or "lexical"
Please note that as of Megaparsec 4.4.0, string
backtracks
automatically (see tokens
), so it does not need try
. However, the
examples above demonstrate the idea behind try
so well that it was
decided to keep them.
lookAhead :: m a -> m a Source #
lookAhead p
parses p
without consuming any input.
If p
fails and consumes some input, so does lookAhead
. Combine with
try
if this is undesirable.
notFollowedBy :: m a -> m () Source #
notFollowedBy p
only succeeds when parser p
fails. This parser
does not consume any input and can be used to implement the “longest
match” rule.
withRecovery :: (ParseError (Token s) e -> m a) -> m a -> m a Source #
withRecovery r p
allows continue parsing even if parser p
fails.
In this case r
is called with actual ParseError
as its argument.
Typical usage is to return value signifying failure to parse this
particular object and to consume some part of input up to start of next
object.
Note that if r
fails, original error message is reported as if
without withRecovery
. In no way recovering parser r
can influence
error messages.
Since: 4.4.0
This parser only succeeds at the end of the input.
token :: (Token s -> Either (Set (ErrorItem (Token s)), Set (ErrorItem (Token s)), Set e) a) -> Maybe (Token s) -> m a Source #
The parser token test mrep
accepts a token t
with result x
when
the function test t
returns
. Right
xmrep
may provide
representation of the token to report in error messages when input
stream in empty.
This is the most primitive combinator for accepting tokens. For
example, the satisfy
parser is implemented as:
satisfy f = token testChar Nothing where testChar x = if f x then Right x else Left (Set.singleton (Tokens (x:|[])), Set.empty, Set.empty)
tokens :: (Token s -> Token s -> Bool) -> [Token s] -> m [Token s] Source #
The parser tokens test
parses list of tokens and returns it.
Supplied predicate test
is used to check equality of given and parsed
tokens.
This can be used for example to write string
:
string = tokens (==)
Note that beginning from Megaparsec 4.4.0, this is an auto-backtracking
primitive, which means that if it fails, it never consumes any
input. This is done to make its consumption model match how error
messages for this primitive are reported (which becomes an important
thing as user gets more control with primitives like withRecovery
):
>>>
parseTest (string "abc") "abd"
1:1: unexpected "abd" expecting "abc"
This means, in particular, that it's no longer necessary to use try
with tokens
-based parsers, such as string
and
string'
. This feature does not affect
performance in any way.
getParserState :: m (State s) Source #
Returns the full parser state as a State
record.
updateParserState :: (State s -> State s) -> m () Source #
updateParserState f
applies function f
to the parser state.
Instances
MonadParsec e s m => MonadParsec e s (IdentityT * m) Source # | |
(Monoid w, MonadParsec e s m) => MonadParsec e s (WriterT w m) Source # | |
(Monoid w, MonadParsec e s m) => MonadParsec e s (WriterT w m) Source # | |
MonadParsec e s m => MonadParsec e s (StateT st m) Source # | |
MonadParsec e s m => MonadParsec e s (StateT st m) Source # | |
MonadParsec e s m => MonadParsec e s (ReaderT * st m) Source # | |
(ErrorComponent e, Stream s) => MonadParsec e s (ParsecT e s m) Source # | |
(<?>) :: MonadParsec e s m => m a -> String -> m a infix 0 Source #
A synonym for label
in form of an operator.
unexpected :: MonadParsec e s m => ErrorItem (Token s) -> m a Source #
The parser unexpected item
always fails with an error message telling
about unexpected item item
without consuming any input.
Parser state combinators
getInput :: MonadParsec e s m => m s Source #
Return the current input.
setInput :: MonadParsec e s m => s -> m () Source #
getPosition :: MonadParsec e s m => m SourcePos Source #
Return the current source position.
See also: setPosition
, pushPosition
, popPosition
, and SourcePos
.
setPosition :: MonadParsec e s m => SourcePos -> m () Source #
setPosition pos
sets the current source position to pos
.
See also: getPosition
, pushPosition
, popPosition
, and SourcePos
.
pushPosition :: MonadParsec e s m => SourcePos -> m () Source #
Push given position into stack of positions and continue parsing working with this position. Useful for working with include files and the like.
See also: getPosition
, setPosition
, popPosition
, and SourcePos
.
Since: 5.0.0
popPosition :: MonadParsec e s m => m () Source #
Pop a position from stack of positions unless it only contains one
element (in that case stack of positions remains the same). This is how
to return to previous source file after pushPosition
.
See also: getPosition
, setPosition
, pushPosition
, and SourcePos
.
Since: 5.0.0
getTabWidth :: MonadParsec e s m => m Pos Source #
Return tab width. Default tab width is equal to defaultTabWidth
. You
can set different tab width with help of setTabWidth
.
setTabWidth :: MonadParsec e s m => Pos -> m () Source #
Set tab width. If argument of the function is not positive number,
defaultTabWidth
will be used.
setParserState :: MonadParsec e s m => State s -> m () Source #
setParserState st
set the full parser state to st
.
Running parser
Arguments
:: Parsec e s a | Parser to run |
-> String | Name of source file |
-> s | Input for parser |
-> Either (ParseError (Token s) e) a |
runParser p file input
runs parser p
on the input list of tokens
input
, obtained from source file
. The file
is only used in error
messages and may be the empty string. Returns either a ParseError
(Left
) or a value of type a
(Right
).
parseFromFile p file = runParser p file <$> readFile file
Arguments
:: Monad m | |
=> ParsecT e s m a | Parser to run |
-> String | Name of source file |
-> s | Input for parser |
-> m (Either (ParseError (Token s) e) a) |
runParserT p file input
runs parser p
on the input list of tokens
input
, obtained from source file
. The file
is only used in error
messages and may be the empty string. Returns a computation in the
underlying monad m
that returns either a ParseError
(Left
) or a
value of type a
(Right
).
Arguments
:: Monad m | |
=> ParsecT e s m a | Parser to run |
-> State s | Initial state |
-> m (State s, Either (ParseError (Token s) e) a) |
This function is similar to runParserT
, but like runParser'
it
accepts and returns parser state. This is thus the most general way to
run a parser.
Since: 4.2.0
Arguments
:: Parsec e s a | Parser to run |
-> String | Name of source file |
-> s | Input for parser |
-> Either (ParseError (Token s) e) a |
parse p file input
runs parser p
over Identity
(see runParserT
if you're using the ParsecT
monad transformer; parse
itself is just a
synonym for runParser
). It returns either a ParseError
(Left
) or a
value of type a
(Right
). parseErrorPretty
can be used to turn
ParseError
into the string representation of the error message. See
Text.Megaparsec.Error if you need to do more advanced error analysis.
main = case (parse numbers "" "11,2,43") of Left err -> putStr (parseErrorPretty err) Right xs -> print (sum xs) numbers = integer `sepBy` char ','
parseMaybe :: (ErrorComponent e, Stream s) => Parsec e s a -> s -> Maybe a Source #
parseMaybe p input
runs parser p
on input
and returns result
inside Just
on success and Nothing
on failure. This function also
parses eof
, so if the parser doesn't consume all of its input, it will
fail.
The function is supposed to be useful for lightweight parsing, where error messages (and thus file name) are not important and entire input should be parsed. For example it can be used when parsing of single number according to specification of its format is desired.