Mapping Table node to <table> tag in bodyXML and vice versa #71

@epavlova

Background

The Table content tree node and the <table> tag in the bodyXML representation differ in several ways. These differences are likely to complicate transforming bodyXML to the content tree format and vice versa.

Example table, taken from the article "The people vs Elon Musk: billionaire transforms Wisconsin court contest" (Mar 2025):

<table class="data-table" data-table-collapse-rownum="" data-table-layout-largescreen="auto"
       data-table-layout-smallscreen="auto" data-table-theme="auto">
    <caption>
        Largest spending groups in Wisconsin supreme court election
    </caption>
    <thead>
        <tr>
            <th data-column-hidden="none" data-column-sortable="false" data-column-type="string">Group
            </th>
            <th data-column-hidden="none" data-column-sortable="false" data-column-type="string">Amount spent
            </th>
            <th data-column-hidden="none" data-column-sortable="false" data-column-type="string">Preferred candidate
            </th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Susan Crawford (candidate)</td>
            <td>$22,042,953</td>
            <td>Crawford</td>
        </tr>
       ...
        <tr>
            <td>Americans for Prosperity</td>
            <td>$3,172,910</td>
            <td>Schimel</td>
        </tr>
    </tbody>
    <tfoot>
        <tr>
            <td colspan="1000">Through March 26; Source: Brennan Center for Justice, Wisconsin Ethics Commission,
                Vivvix/CMAG
            </td>
        </tr>
    </tfoot>
</table>

<th> tags

In bodyXML, the <table> tag supports four types of children: caption, head, body and footer. The Table content tree node has no child corresponding to the head.

The table head cell (<th>) is defined in the XML schema as follows:

<xs:complexType name="Th">
    <xs:complexContent>
        <xs:extension base="Td">
            <xs:attribute name="data-column-type" type="xs:string"/>
            <xs:attribute name="data-column-sortable" type="xs:boolean"/>
            <xs:attribute name="data-column-default-sort" type="xs:string"/>
            <xs:attribute name="data-column-hidden" type="xs:string"/>
        </xs:extension>
    </xs:complexContent>
</xs:complexType>
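
To make the gap concrete, here is a minimal sketch of reading these four attributes off a <th> element. The names `ThAttributes`, `parseXsBoolean` and `readThAttributes` are hypothetical and not part of the content tree or any existing transformer. The only outside fact used is that `xs:boolean` accepts the lexical values "true", "false", "1" and "0".

```typescript
// Hypothetical holder for the attributes declared on the Th complex type.
interface ThAttributes {
	columnType: string | undefined // data-column-type (xs:string)
	sortable: boolean | undefined // data-column-sortable (xs:boolean)
	defaultSort: string | undefined // data-column-default-sort (xs:string)
	hidden: string | undefined // data-column-hidden (xs:string)
}

// xs:boolean allows exactly "true", "false", "1" and "0".
function parseXsBoolean(raw: string | undefined): boolean | undefined {
	if (raw === undefined) return undefined
	const value = raw.trim()
	if (value === 'true' || value === '1') return true
	if (value === 'false' || value === '0') return false
	return undefined // not a valid xs:boolean, so leave it undecided
}

// attrs is assumed to be a plain name -> value map of the <th> attributes.
function readThAttributes(
	attrs: Record<string, string | undefined>
): ThAttributes {
	return {
		columnType: attrs['data-column-type'],
		sortable: parseXsBoolean(attrs['data-column-sortable']),
		defaultSort: attrs['data-column-default-sort'],
		hidden: attrs['data-column-hidden'],
	}
}
```

Where these values should live in the content tree (for example in `columnSettings`) is exactly the open question, so the sketch stops at extraction.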

<table> attributes

In bodyXML, the <table> tag supports the following attributes:

<xs:attribute name="class" type="xs:string" fixed="data-table" use="required"/>
<xs:attribute name="id" type="xs:string"/>
<xs:attribute name="data-table-layout-largescreen" type="xs:string"/>
<xs:attribute name="data-name" type="xs:string"/>
<xs:attribute name="data-table-layout-smallscreen" type="xs:string"/>
<xs:attribute name="data-table-theme" type="xs:string"/>
<xs:attribute name="data-table-collapse-rownum" type="xs:string"/>

These do not map one-to-one onto the properties of the Table content tree node:

interface Table extends Parent {
	type: 'table'
	stripes: boolean
	compact: boolean
	layoutWidth:
		| 'auto'
		| 'full-grid'
		| 'inset-left'
		| 'inset-right'
		| 'full-bleed'
	collapseAfterHowManyRows?: number
	responsiveStyle: 'overflow' | 'flat' | 'scroll'
	children: [TableCaption, TableBody, TableFooter] | [TableCaption, TableBody] | [TableBody, TableFooter] | [TableBody]
	columnSettings: TableColumnSettings[]
}
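
One mismatch is visible in the types alone: `data-table-collapse-rownum` is an `xs:string` (and is empty in the example above), while `collapseAfterHowManyRows` is an optional number. The sketch below assumes, without confirmation, that these two fields correspond; `parseCollapseRownum` and `serializeCollapseRownum` are hypothetical helper names.

```typescript
// ASSUMPTION: data-table-collapse-rownum maps to collapseAfterHowManyRows.
// Returns undefined for a missing, empty or non-numeric attribute.
function parseCollapseRownum(raw: string | undefined): number | undefined {
	if (raw === undefined) return undefined
	const trimmed = raw.trim()
	if (!/^\d+$/.test(trimmed)) return undefined
	return Number.parseInt(trimmed, 10)
}

// Going back to bodyXML, an unset value is written as an empty string,
// mirroring the example table above.
function serializeCollapseRownum(rows: number | undefined): string {
	return rows === undefined ? '' : String(rows)
}
```

Even this simple case is lossy: an absent attribute and an empty one both become `undefined`, so a round trip cannot tell them apart.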

<td> attributes

The <td> tag supports a colspan attribute, which has no corresponding property in the TableCell content tree node.

    <xs:attribute name="colspan" type="xs:string"/>
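
Since colspan is typed as a string, and the example footer uses colspan="1000" apparently to span every column, any transformer would need to interpret the value. The following is a sketch only; `parseColspan` is a hypothetical helper, and clamping to the table's column count is an assumption rather than agreed behaviour.

```typescript
// Interprets a raw colspan attribute value.
// ASSUMPTION: values larger than the number of columns mean "span all".
// columnCount is assumed to be at least 1.
function parseColspan(raw: string | undefined, columnCount: number): number {
	if (raw === undefined) return 1
	const trimmed = raw.trim()
	if (!/^\d+$/.test(trimmed)) return 1 // empty or non-numeric: no spanning
	const span = Number.parseInt(trimmed, 10)
	return Math.min(Math.max(span, 1), columnCount)
}

console.log(parseColspan('1000', 3)) // prints 3
```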

TableCell attributes

The TableCell node supports a heading property, which is not present in the bodyXML representation of the <td> tag.

interface TableCell extends Parent {
	type: 'table-cell'
	heading?: boolean
	children: Phrasing[]
}
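
As one possible direction for discussion, and not a decided approach, the heading flag could be derived from the tag name, with <th> becoming a heading cell. In the sketch below `Phrasing` and `TableCell` are simplified local stand-ins for the content tree types, and `cellFromTag` and `tagForCell` are hypothetical names.

```typescript
// Simplified stand-ins for the content tree types, for illustration only.
type Phrasing = { type: string; value?: string }

interface TableCell {
	type: 'table-cell'
	heading?: boolean
	children: Phrasing[]
}

// bodyXML -> content tree: a <th> becomes a cell with heading: true.
function cellFromTag(tagName: string, children: Phrasing[]): TableCell {
	const cell: TableCell = { type: 'table-cell', children }
	if (tagName.toLowerCase() === 'th') cell.heading = true
	return cell
}

// content tree -> bodyXML: a heading cell is written back as <th>.
function tagForCell(cell: TableCell): 'th' | 'td' {
	return cell.heading ? 'th' : 'td'
}
```

This mapping drops the data-column-* attributes carried by <th>, and it does not record whether a heading cell came from <thead> or <tbody>, so it would not be sufficient on its own.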

Possible approaches

Each of the differences above is probably worth discussing before any specific approach is proposed.
