How to use the command xmllint (with examples)
xmllint is a command-line tool that can be used to parse and navigate XML files. It supports XPath, a syntax for querying and traversing XML trees. This article provides examples of different use cases for the xmllint command.
Use case 1: Return all nodes (tags) named “foo”
Code:
xmllint --xpath "//foo" source_file.xml
Motivation: This use case is helpful when we want to extract all the nodes with a specific tag name from an XML file.
Explanation:
--xpath specifies that we want to use an XPath expression to query the XML.
"//foo" is the XPath expression that selects all the nodes named “foo”.
source_file.xml is the path to the XML file.
Example output:
<foo>...</foo>
<foo>...</foo>
...
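As a minimal sketch of this use case, the snippet below builds a small sample document and queries it. The file name sample.xml and its contents are made up for illustration. Depending on the libxml2 version, the matched nodes may be printed on separate lines or run together on one line.

```shell
# Create a throwaway sample document (hypothetical content, for illustration only).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.xml" <<'EOF'
<root>
  <foo>first</foo>
  <bar><foo>second</foo></bar>
</root>
EOF

# Select every <foo> element, regardless of how deeply it is nested.
nodes=$(xmllint --xpath "//foo" "$tmpdir/sample.xml")
printf '%s\n' "$nodes"
```

Both the top-level and the nested <foo> element should appear in the result, because // searches the whole tree.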
Use case 2: Return the contents of the first node named “foo” as a string
Code:
xmllint --xpath "string(//foo)" source_file.xml
Motivation: This use case is useful when we need to extract the inner text of a specific node in an XML file.
Explanation:
--xpath specifies that we want to use an XPath expression to query the XML.
"string(//foo)" is the XPath expression that selects the first node named “foo” and returns its contents as a string.
source_file.xml is the path to the XML file.
Example output:
foo content
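The following sketch illustrates what "contents as a string" means in practice, assuming a made-up sample.xml: string() takes only the first matching node and concatenates the text of that node and its descendants, dropping any markup.

```shell
# Create a throwaway sample document (hypothetical content, for illustration only).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.xml" <<'EOF'
<root>
  <foo>hello <b>world</b></foo>
  <foo>ignored</foo>
</root>
EOF

# string() converts the first <foo> in document order to its text value.
text=$(xmllint --xpath "string(//foo)" "$tmpdir/sample.xml")
echo "$text"   # prints: hello world
```

The second <foo> element does not contribute to the result, and the <b> tags are not part of the returned text.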
Use case 3: Return the href attribute of the second anchor element in an HTML file
Code:
xmllint --html --xpath "string((//a)[2]/@href)" webpage.xhtml
Motivation: This use case is valuable when we want to extract a specific attribute value from an HTML file.
Explanation:
--html tells xmllint to parse the input with its HTML parser instead of the XML parser.
--xpath specifies that we want to use an XPath expression to query the HTML.
"string((//a)[2]/@href)" is the XPath expression. (//a)[2] selects the second anchor element in document order, /@href selects its href attribute, and string() returns the attribute value as text. Note that the shorter form //a[2] means something different: it matches every anchor that is the second a child of its own parent.
webpage.xhtml is the path to the HTML file.
Example output:
https://www.example.com/link2
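The position predicate is easy to get wrong, so here is a small sketch comparing two forms of the expression. The file page.html and its links are invented for illustration. In XPath, //a[2] matches anchors that are the second a child of their own parent, while (//a)[2] picks the second anchor in the whole document.

```shell
# Create a throwaway HTML page (hypothetical content, for illustration only).
tmpdir=$(mktemp -d)
cat > "$tmpdir/page.html" <<'EOF'
<html><body>
<p><a href="/one">one</a></p>
<p><a href="/two">two</a> <a href="/three">three</a></p>
</body></html>
EOF

# Second <a> child within its parent: only the last link qualifies here.
per_parent=$(xmllint --html --xpath "string(//a[2]/@href)" "$tmpdir/page.html" 2>/dev/null)

# Second <a> in document order, counted across the whole page.
in_document=$(xmllint --html --xpath "string((//a)[2]/@href)" "$tmpdir/page.html" 2>/dev/null)

echo "per parent:  $per_parent"    # prints: per parent:  /three
echo "in document: $in_document"   # prints: in document: /two
```

When all anchors share one parent the two expressions agree, which is why the difference often goes unnoticed.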
Use case 4: Return human-readable (indented) XML from file
Code:
xmllint --format source_file.xml
Motivation: This use case is useful when we want to format an XML file for better readability.
Explanation:
--format instructs xmllint to format the XML file in human-readable form.
source_file.xml is the path to the XML file.
Example output:
<root>
  <element>
    ...
  </element>
  ...
</root>
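To see the reformatting on a concrete input, the sketch below writes a single-line document (a made-up compact.xml) and pretty-prints it. With default settings xmllint is expected to emit an XML declaration followed by the tree, with each nesting level indented.

```shell
# Write a compact, single-line document (hypothetical content, for illustration only).
tmpdir=$(mktemp -d)
printf '<root><element>text</element><element>more</element></root>' > "$tmpdir/compact.xml"

# Re-serialize the document with line breaks and indentation.
formatted=$(xmllint --format "$tmpdir/compact.xml")
printf '%s\n' "$formatted"
```

The result goes to standard output, so the original file is left unchanged unless the output is redirected to a new file.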
Use case 5: Check that an XML file meets the requirements of its DOCTYPE declaration
Code:
xmllint --valid source_file.xml
Motivation: This use case is essential to verify if an XML file is valid according to its DOCTYPE declaration.
Explanation:
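--valid tells xmllint to validate the XML file against the DTD named in its DOCTYPE declaration, in addition to checking that it is well-formed.
source_file.xml is the path to the XML file.
Example output:
For a valid file, xmllint prints the parsed document and reports no validity errors. For an invalid file, the violations are reported on standard error and the exit status is nonzero. Adding --noout suppresses the document, so a valid file then produces no output at all.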
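A small sketch of this check in a script, assuming two made-up files with an inline DTD: the exit status tells the script whether validation passed, and --noout keeps the parsed document out of the output.

```shell
# A document with an internal DTD (hypothetical content, for illustration only).
tmpdir=$(mktemp -d)
cat > "$tmpdir/good.xml" <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>Ann</to><body>Hi</body></note>
EOF

# Same DTD, but the required <body> element is removed.
sed 's|<body>Hi</body>||' "$tmpdir/good.xml" > "$tmpdir/bad.xml"

# Record the exit status of each run; 0 means the document validated.
good_status=0
xmllint --valid --noout "$tmpdir/good.xml" || good_status=$?
bad_status=0
xmllint --valid --noout "$tmpdir/bad.xml" 2>/dev/null || bad_status=$?

echo "good.xml exit status: $good_status"
echo "bad.xml exit status:  $bad_status"
```

The valid file should yield status 0 and the invalid one a nonzero status; the xmllint manual page lists 3 as the code for validation errors.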
Use case 6: Validate XML against DTD schema hosted online
Code:
xmllint --dtdvalid URL source_file.xml
Motivation: This use case is crucial when we want to validate XML against a DTD schema hosted online.
Explanation:
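--dtdvalid URL instructs xmllint to validate the XML file against the DTD at the given location, regardless of any DOCTYPE declaration in the file.
URL is the URL of the DTD schema.
source_file.xml is the path to the XML file.
Example output:
For a valid file, xmllint prints the parsed document and reports no validity errors. If validation fails, the violations are reported on standard error, along with a message like "Document source_file.xml does not validate against URL".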
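The sketch below uses the same option with a local DTD file instead of a remote one, so it can be tried without network access. The file names note.dtd, good.xml, and bad.xml are made up for illustration, and using a file path where the URL goes is an assumption of this example.

```shell
# A standalone DTD and two documents without a DOCTYPE (hypothetical content).
tmpdir=$(mktemp -d)
cat > "$tmpdir/note.dtd" <<'EOF'
<!ELEMENT note (to, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT body (#PCDATA)>
EOF
printf '<note><to>Ann</to><body>Hi</body></note>\n' > "$tmpdir/good.xml"
printf '<note><to>Ann</to></note>\n' > "$tmpdir/bad.xml"

# Validate each document against the external DTD and record the exit status.
good_status=0
xmllint --noout --dtdvalid "$tmpdir/note.dtd" "$tmpdir/good.xml" || good_status=$?
bad_status=0
xmllint --noout --dtdvalid "$tmpdir/note.dtd" "$tmpdir/bad.xml" 2>/dev/null || bad_status=$?

echo "good.xml exit status: $good_status"
echo "bad.xml exit status:  $bad_status"
```

Swapping the local path for the address of a hosted DTD gives the online variant described above.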
Conclusion:
The xmllint command is a versatile tool for parsing and manipulating XML files. With its support for XPath expressions, it can perform various operations such as querying nodes, extracting values, and validating XML against schemas. By understanding the different use cases illustrated in this article, users can effectively utilize xmllint for their XML processing needs.