Extending the parser
Modules such as page3 extend the CSS 2.1 parser to add support for
CSS 3 syntax.
They do so by sub-classing css21.CSS21Parser and overriding/extending
some of its methods. In fact, the parser is made of methods in a class
(rather than a set of functions) solely to enable this kind of sub-classing.
tinycss is designed so that you can keep parser subclasses outside of tinycss, without monkey-patching. If, however, the syntax you added comes from a W3C specification, consider including your subclass in a new tinycss module and sending a pull request: see Hacking tinycss.
Example: star hack
The star hack uses invalid declarations that are only parsed by some versions of Internet Explorer. By default, tinycss ignores invalid declarations and logs an error.
>>> from tinycss.css21 import CSS21Parser
>>> css = '#elem { width: [W3C Model Width]; *width: [BorderBox Model]; }'
>>> stylesheet = CSS21Parser().parse_stylesheet(css)
>>> stylesheet.errors
[ParseError('Parse error at 1:35, expected a property name, got DELIM',)]
>>> [decl.name for decl in stylesheet.rules[0].declarations]
['width']
If, for example, a minifier based on tinycss wants to support the star hack, it can do so by extending the parser:
>>> class CSSStarHackParser(CSS21Parser):
...     def parse_declaration(self, tokens):
...         has_star_hack = (tokens[0].type == 'DELIM' and tokens[0].value == '*')
...         if has_star_hack:
...             tokens = tokens[1:]
...         declaration = super(CSSStarHackParser, self).parse_declaration(tokens)
...         declaration.has_star_hack = has_star_hack
...         return declaration
...
>>> stylesheet = CSSStarHackParser().parse_stylesheet(css)
>>> stylesheet.errors
[]
>>> [(d.name, d.has_star_hack) for d in stylesheet.rules[0].declarations]
[('width', False), ('width', True)]
This class extends the parse_declaration() method.
It removes any * delimiter Token at the start of
a declaration, and adds a has_star_hack boolean attribute on parsed
Declaration objects: True if a * was removed, False for
“normal” declarations.
Parser methods
In addition to methods of the user API (see Parsing a stylesheet), here are the methods of the CSS 2.1 parser that can be overridden or extended:
- CSS21Parser.parse_rules(tokens, context)
Parse a sequence of rules (rulesets and at-rules).
- Parameters
tokens – An iterable of tokens.
context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)
- Returns
A tuple of a list of parsed rules and a list of
ParseError.
- CSS21Parser.read_at_rule(at_keyword_token, tokens)
Read an at-rule from a token stream.
- Parameters
at_keyword_token – The ATKEYWORD token that starts this at-rule. You may have read it already to distinguish the rule from a ruleset.
tokens – An iterator of subsequent tokens. Will be consumed just enough for one at-rule.
- Returns
An unparsed AtRule.
- Raises
ParseError if the head is invalid for the core grammar. The body is not validated. See AtRule.
- CSS21Parser.parse_at_rule(rule, previous_rules, errors, context)
Parse an at-rule.
Subclasses that override this method must use super() and pass its return value for at-rules they do not know.
In CSS 2.1, this method handles @charset, @import, @media and @page rules.
- Parameters
rule – An unparsed AtRule.
previous_rules – The list of at-rules and rulesets that have been parsed so far in this context. This list can be used to decide if the current rule is valid. (For example, @import rules are only allowed before anything but a @charset rule.)
context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)
- Raises
ParseError if the rule is invalid.
- Returns
A parsed at-rule.
- CSS21Parser.parse_media(tokens)
For CSS 2.1, parse a list of media types.
Parsers that add support for Media Queries are expected to override this.
- Parameters
tokens – A list of tokens.
- Raises
ParseError on invalid media types/queries.
- Returns
For CSS 2.1, a list of media types as strings.
- CSS21Parser.parse_page_selector(tokens)
Parse an @page selector.
- Parameters
tokens – An iterable of tokens, typically from the head attribute of an unparsed AtRule.
- Returns
A page selector. For CSS 2.1, this is 'first', 'left', 'right' or None.
- Raises
ParseError on invalid selectors.
- CSS21Parser.parse_declarations_and_at_rules(tokens, context)
Parse a mixed list of declarations and at-rules, as found e.g. in the body of an @page rule.
Note that to add supported at-rules inside @page, CSSPage3Parser extends parse_at_rule(), not this method.
- Parameters
tokens – An iterable of tokens, typically from the body attribute of an unparsed AtRule.
context – An at-keyword such as '@page'. (Most at-rules are only allowed in some contexts.)
- Returns
A tuple of three lists: the Declaration objects, the parsed at-rules (empty for CSS 2.1), and the ParseError objects.
- CSS21Parser.parse_ruleset(first_token, tokens)
Parse a ruleset: a selector followed by a declaration block.
- Parameters
first_token – The first token of the ruleset (probably of the selector). You may have read it already to distinguish the rule from an at-rule.
tokens – An iterator of subsequent tokens. Will be consumed just enough for one ruleset.
- Returns
A tuple of a RuleSet and an error list. The errors are recovered ParseError in declarations. (Parsing continues from the next declaration on such errors.)
- Raises
ParseError if the selector is invalid for the core grammar. Note that a selector can be valid for the core grammar but not for CSS 2.1 or another level.
- CSS21Parser.parse_declaration_list(tokens)
Parse a ;-separated declaration list.
You may want to use parse_declarations_and_at_rules() (or some other method that uses parse_declaration() directly) instead if you have not just declarations in the same context.
- Parameters
tokens – An iterable of tokens. Should stop at (before) the end of the block, as marked by }.
- Returns
A tuple of the list of valid Declaration and a list of ParseError.
- CSS21Parser.parse_declaration(tokens)
Parse a single declaration.
- Parameters
tokens – An iterable of at least one token. Should stop at (before) the end of the declaration, as marked by a ; or }. Empty declarations (i.e. consecutive ; with only white space in-between) should be skipped earlier and not passed to this method.
- Returns
A Declaration.
- Raises
ParseError if the tokens do not match the ‘declaration’ production of the core grammar.
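The super() contract stated for parse_at_rule() can be sketched without importing tinycss at all. Everything below is a stand-in written for illustration: BaseParser plays the role of CSS21Parser, rules are plain dicts rather than AtRule objects, the return values are simple tuples, and the @x-note at-keyword is hypothetical. Only the control flow is the point: handle the at-rules you know, and hand every other rule to the parent class, returning whatever it returns.

```python
class ParseError(ValueError):
    """Stand-in for tinycss's ParseError (not the real class)."""


class BaseParser(object):
    """Stand-in for CSS21Parser: knows a fixed set of at-rules."""

    def parse_at_rule(self, rule, previous_rules, errors, context):
        if rule['at_keyword'] == '@media':
            return ('media-rule', rule['head'])
        raise ParseError('unknown at-rule in %s context: %s'
                         % (context, rule['at_keyword']))


class NoteParser(BaseParser):
    """Adds a hypothetical @x-note rule, allowed only at the top level."""

    def parse_at_rule(self, rule, previous_rules, errors, context):
        if rule['at_keyword'] == '@x-note':
            if context != 'stylesheet':
                raise ParseError('@x-note rule not allowed in %s' % context)
            return ('note-rule', rule['head'])
        # Not one of ours: defer to the parent and pass its result through.
        return super(NoteParser, self).parse_at_rule(
            rule, previous_rules, errors, context)


parser = NoteParser()
note = parser.parse_at_rule(
    {'at_keyword': '@x-note', 'head': 'todo'}, [], [], 'stylesheet')
media = parser.parse_at_rule(
    {'at_keyword': '@media', 'head': 'print'}, [], [], 'stylesheet')
```

Because the subclass falls through to super(), rules handled by the parent keep working, and several such subclasses can be stacked, each adding its own at-rules.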
Unparsed at-rules
- class tinycss.css21.AtRule(at_keyword, head, body, line, column)
An unparsed at-rule.
- at_keyword
The normalized (lower-case) at-keyword as a string, e.g. '@page'.
- head
The part of the at-rule between the at-keyword and the { marking the body, or the ; marking the end of an at-rule without a body. A TokenList.
- body
The content of the body between { and } as a TokenList, or None if there is no body (i.e. if the rule ends with ;).
The head was validated against the core grammar but not the body, as the body might contain declarations. In case of an error in a declaration, parsing should continue from the next declaration. The whole rule should not be ignored as it would be for an error in the head.
These at-rules are expected to be parsed further before reaching the user API.
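The recovery rule described above (an error in one declaration is recorded, and parsing resumes at the next declaration instead of discarding the whole rule) can be illustrated with a deliberately simplified sketch. It works on plain strings rather than tinycss tokens, and toy_parse_declaration() and parse_body() are hypothetical names invented here, not tinycss functions.

```python
class ParseError(ValueError):
    """Stand-in for tinycss's ParseError (not the real class)."""


def toy_parse_declaration(text):
    """Toy declaration parser: 'name: value' -> (name, value)."""
    name, colon, value = text.partition(':')
    name, value = name.strip(), value.strip()
    if not colon or not name.replace('-', '').isalnum() or not value:
        raise ParseError('invalid declaration: %r' % text.strip())
    return name, value


def parse_body(body):
    """Parse a ';'-separated body, recovering after each bad declaration."""
    declarations, errors = [], []
    for part in body.split(';'):
        if not part.strip():
            continue  # empty declaration: skipped, not reported as an error
        try:
            declarations.append(toy_parse_declaration(part))
        except ParseError as exc:
            errors.append(exc)  # record it and carry on with the next one
    return declarations, errors


declarations, errors = parse_body('margin: 1cm; *width: 2cm; size: A4;')
```

Here the invalid `*width` declaration ends up in the error list while the declarations on either side of it are kept, which mirrors the (declarations, errors) return shape documented for the parser methods.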
Parsing helper functions
The tinycss.parsing module contains helper functions for parsing
tokens into a more structured form:
- tinycss.parsing.strip_whitespace(tokens)
Remove whitespace at the beginning and end of a token list.
Whitespace tokens in-between other tokens in the list are preserved.
- Parameters
tokens – A list of Token or ContainerToken.
- Returns
A new sub-sequence of the list.
- tinycss.parsing.split_on_comma(tokens)
Split a list of tokens on commas, i.e. , DELIM tokens.
Only “top-level” comma tokens are splitting points, not commas inside a function or other ContainerToken.
- Parameters
tokens – An iterable of Token or ContainerToken.
- Returns
A list of lists of tokens.
- tinycss.parsing.validate_value(tokens)
Validate a property value.
- Parameters
tokens – An iterable of tokens.
- Raises
ParseError if there is any invalid token for the ‘value’ production of the core grammar.
- tinycss.parsing.validate_block(tokens, context)
- Raises
ParseError if there is any invalid token for the ‘block’ production of the core grammar.
- Parameters
tokens – An iterable of tokens.
context – A string for the ‘unexpected in …’ message.
- tinycss.parsing.validate_any(token, context)
- Raises
ParseError if this is an invalid token for the ‘any’ production of the core grammar.
- Parameters
token – A single token.
context – A string for the ‘unexpected in …’ message.
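To make the documented behaviour of strip_whitespace() and split_on_comma() concrete, here is a minimal re-implementation sketch. It is not the tinycss source: Token is a stand-in namedtuple with only the two attributes used, and the sketch assumes that whitespace tokens have the type 'S' and that a comma is a DELIM token whose value is ','.

```python
from collections import namedtuple

# Stand-in for tinycss tokens: only the two attributes used below.
Token = namedtuple('Token', 'type value')


def strip_whitespace(tokens):
    """Return a new list without leading and trailing whitespace tokens."""
    start, end = 0, len(tokens)
    while start < end and tokens[start].type == 'S':
        start += 1
    while end > start and tokens[end - 1].type == 'S':
        end -= 1
    return tokens[start:end]


def split_on_comma(tokens):
    """Split on top-level ',' DELIM tokens.

    Commas nested in a function would sit inside that container token,
    so a flat loop over the top-level tokens never sees them.
    """
    parts, current = [], []
    for token in tokens:
        if token.type == 'DELIM' and token.value == ',':
            parts.append(current)
            current = []
        else:
            current.append(token)
    parts.append(current)
    return parts


# Hand-built tokens for a head such as " screen , print "
head = [Token('S', ' '), Token('IDENT', 'screen'), Token('S', ' '),
        Token('DELIM', ','), Token('S', ' '), Token('IDENT', 'print'),
        Token('S', ' ')]
media_types = [[t.value for t in strip_whitespace(part)]
               for part in split_on_comma(head)]
```

Splitting first and stripping each part afterwards is the natural order when handling comma-separated lists: each part keeps any inner whitespace but loses the padding around the commas.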