initial docs for the Block Builder
update
nickmoreton committed Oct 28, 2021
1 parent 16db684 commit d882e43
Showing 3 changed files with 252 additions and 4 deletions.
251 changes: 251 additions & 0 deletions docs/blockbuilder.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,251 @@
# Block Builder

- [Block Builder](#block-builder)
  - [Included Blocks](#included-blocks)
    - [Heading Block `<h1>, <h2>, <h3>, <h4>, <h5>, <h6>`](#heading-block-h1-h2-h3-h4-h5-h6)
    - [Table Block `<table>`](#table-block-table)
    - [Iframe Block `<iframe>`](#iframe-block-iframe)
    - [Form Block `<form>`](#form-block-form)
    - [Image Block `<img />` `TODO not yet complete, likely to come from shortcode parsing`](#image-block-img--todo-not-yet-complete-likely-to-come-from-shortcode-parsing)
    - [Blockquote Block `<blockquote>`](#blockquote-block-blockquote)
  - [Included Fallback/Catch-all Block](#included-fallbackcatch-all-block)
  - [Configuration](#configuration)
- [Configuration](#configuration)

The block builder takes the page body content as a string of HTML.
The filters listed below are then used to parse the HTML into a sequence of StreamField blocks.

The parsing process uses Beautiful Soup to analyze each top-level HTML tag in the order it appears in the HTML body content. If the tag name matches an entry in the `WAGTAIL_WORDPRESS_IMPORTER_CONVERT_HTML_TAGS_TO_BLOCKS` configuration, a single block is created for that HTML tag.
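
The tag matching described above can be pictured with a minimal sketch. It assumes the configuration has already been turned into a plain mapping of tag name to builder callable, and it uses a hypothetical `FakeTag` stand-in rather than a real Beautiful Soup tag so that it runs without third-party packages. The importer's actual internals may differ.

```python
from collections import namedtuple

# Hypothetical stand-in for a Beautiful Soup tag (only the attributes used here).
FakeTag = namedtuple("FakeTag", ["name", "text"])


def build_heading_block(tag):
    return {"type": "heading", "value": {"importance": tag.name, "text": tag.text}}


# Assumed shape: tag name -> builder callable, derived from the settings list.
BUILDERS = {"h1": build_heading_block, "h2": build_heading_block}


def blocks_for(tags):
    """Return one block per tag that has a configured builder, in document order."""
    blocks = []
    for tag in tags:
        builder = BUILDERS.get(tag.name)
        if builder is not None:
            blocks.append(builder(tag))
        # Tags without a builder would go to the fallback block (not shown here).
    return blocks


tags = [FakeTag("h1", "Title"), FakeTag("p", "Body"), FakeTag("h2", "Section")]
result = blocks_for(tags)
```

Here `result` holds two heading blocks, one for the `<h1>` and one for the `<h2>`; the `<p>` has no entry in `BUILDERS` and is skipped by this simplified loop.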

---

## Included Blocks

#### Heading Block `<h1>, <h2>, <h3>, <h4>, <h5>, <h6>`

Filter:

```python
def build_heading_block(tag):
    block_dict = {
        "type": "heading",
        "value": {"importance": tag.name, "text": tag.text},
    }
    return block_dict
```
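
As a quick illustration of the dictionary this filter returns, the sketch below calls it with a hypothetical stand-in object that exposes only `name` and `text`, the two attributes the filter reads. A real call would receive a Beautiful Soup tag instead.

```python
from types import SimpleNamespace


def build_heading_block(tag):
    return {"type": "heading", "value": {"importance": tag.name, "text": tag.text}}


# Hypothetical stand-in for the tag parsed from <h2>About us</h2>.
tag = SimpleNamespace(name="h2", text="About us")
block = build_heading_block(tag)
print(block)
# {'type': 'heading', 'value': {'importance': 'h2', 'text': 'About us'}}
```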

Wagtail Block:

```python
class HeadingBlock(blocks.StructBlock):
    text = blocks.CharBlock(classname="title")
    importance = blocks.ChoiceBlock(
        choices=(
            ("h1", "H1"),
            ("h2", "H2"),
            ("h3", "H3"),
            ("h4", "H4"),
            ("h5", "H5"),
            ("h6", "H6"),
        ),
        default="h1",
    )

    class Meta:
        icon = "title"
        template = "wagtail_wordpress_import/heading_block.html"
```

#### Table Block `<table>`
Filter:
```python
def build_table_block(tag):
    block_dict = {"type": "raw_html", "value": str(tag)}
    return block_dict
```
Wagtail Block:
```python
blocks.RawHTMLBlock()
```

#### Iframe Block `<iframe>`
Filter:
```python
def build_iframe_block(tag):
    block_dict = {
        "type": "raw_html",
        "value": '<div class="core-custom"><div class="responsive-iframe">{}</div></div>'.format(
            str(tag)
        ),
    }
    return block_dict
```
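
To make the wrapper markup concrete, here is a small sketch of the value this filter produces. A plain string stands in for the Beautiful Soup tag, which works only because the filter does nothing with the tag except call `str()` on it. The iframe URL is made up for the example.

```python
def build_iframe_block(tag):
    return {
        "type": "raw_html",
        "value": '<div class="core-custom"><div class="responsive-iframe">{}</div></div>'.format(
            str(tag)
        ),
    }


# Assumed input: the serialized iframe element as it appears in the page body.
iframe_html = '<iframe src="https://example.com/embed"></iframe>'
block = build_iframe_block(iframe_html)
print(block["value"])
```

The printed value is the original iframe wrapped in the two `<div>` elements, so a site stylesheet can target `.responsive-iframe` if it chooses to.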
Wagtail Block:
```python
blocks.RawHTMLBlock()
```

#### Form Block `<form>`
Filter:
```python
def build_form_block(tag):
    block_dict = {"type": "raw_html", "value": str(tag)}
    return block_dict
```
Wagtail Block:
```python
blocks.RawHTMLBlock()
```

#### Image Block `<img />` `TODO not yet complete, likely to come from shortcode parsing`

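Filter (`TODO not yet complete`):
```python
def build_image_block(tag):
    def get_image_id(src):
        # Placeholder: always returns 1 until the image lookup is implemented.
        return 1

    # Read the src attribute; `tag.src` would look for a child <src> tag instead.
    block_dict = {"type": "image", "value": get_image_id(tag.get("src"))}
    return block_dict
```
Wagtail Block: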
```python
class ImageBlock(blocks.StructBlock):
    image = ImageChooserBlock()
    caption = blocks.CharBlock(required=False)

    class Meta:
        icon = "image"
        template = "wagtail_wordpress_import/image_block.html"
```

#### Blockquote Block `<blockquote>`
Filter:
```python
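def build_block_quote_block(tag):
    # In Beautiful Soup, tag.cite is the first nested <cite> element (or None),
    # not the value of a cite="" attribute.
    block_dict = {
        "type": "block_quote",
        "value": {"quote": tag.text.strip(), "attribution": tag.cite},
    }
    return block_dict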
```
Wagtail Block:
```python
class QuoteBlock(blocks.StructBlock):
    quote = blocks.CharBlock(form_classname="title")
    attribution = blocks.CharBlock(required=False)

    class Meta:
        icon = "openquote"
        template = "wagtail_wordpress_import/quote_block.html"
```

## Included Fallback/Catch-all Block

By default, the fallback block is a Wagtail `RichText` Block.

Only content that has no specific block filter is added to the fallback block.

Example: `<p> <ul> <a> <img /> ...`

This block is saved to the block sequence only when the builder finds that a new block is required or when it reaches the end of the content.
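
The accumulate-then-save behaviour can be sketched as follows. This is an illustration under assumptions, not the importer's code: top-level nodes are modelled as hypothetical `(tag_name, html)` pairs, `builders` is an assumed mapping of tag name to callable, and the image and document processing applied on save is left out.

```python
def build_blocks(nodes, builders):
    """nodes: iterable of (tag_name, html) pairs; builders: tag name -> callable(html)."""
    blocks = []
    cache = ""  # HTML waiting to be saved as a single rich_text block

    def flush():
        nonlocal cache
        if cache:
            blocks.append({"type": "rich_text", "value": cache})
            cache = ""

    for name, html in nodes:
        builder = builders.get(name)
        if builder is None:
            cache += html  # no dedicated filter: keep accumulating
        else:
            flush()  # a new block is required, so save pending content first
            blocks.append(builder(html))
    flush()  # end of content: save whatever is left
    return blocks


nodes = [
    ("p", "<p>Intro</p>"),
    ("ul", "<ul><li>Item</li></ul>"),
    ("table", "<table></table>"),
    ("p", "<p>Outro</p>"),
]
builders = {"table": lambda html: {"type": "raw_html", "value": html}}
blocks = build_blocks(nodes, builders)
# Block types, in order: rich_text, raw_html, rich_text
```

The first two nodes end up in one `rich_text` block because nothing forces a save until the `<table>` is reached.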

Each time this block is saved to the block sequence, extra processing is applied:

1. All `<img />` `src` values are analyzed. If an image is local to the site, it is fetched and saved to the Wagtail Images app, and the `<img />` tag is replaced with the Wagtail rich text embedded content type, e.g. `<embed embedtype="image" id="1001" alt="An image description" format="left" />`
2. All `<a href="..."></a>` `href` values are analyzed. If a link points to a document, the document is fetched and saved to the Wagtail Documents app, and the `<a href=""></a>` tag is updated to the Wagtail rich text `linktype` format, e.g. `<a id="1001" linktype="document">link</a>`
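
The two checks in the list above ("is this image local to the site?" and "is this link a document?") might look something like the sketch below. Everything in it is an assumption made for illustration: the helper names, the `SITE_HOSTS` set and the list of document extensions are hypothetical, and the importer's real rules may differ.

```python
import os
from urllib.parse import urlparse

# Assumed values for illustration only.
SITE_HOSTS = {"www.example.com"}
DOCUMENT_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx"}


def is_local_image(src):
    """True for relative URLs or URLs whose host is one of SITE_HOSTS."""
    host = urlparse(src).netloc
    return host == "" or host in SITE_HOSTS


def is_document_link(href):
    """True when the URL path ends with a known document extension."""
    extension = os.path.splitext(urlparse(href).path)[1].lower()
    return extension in DOCUMENT_EXTENSIONS


local = is_local_image("/wp-content/uploads/photo.png")
remote = is_local_image("https://cdn.other.org/photo.png")
document = is_document_link("https://www.example.com/files/report.PDF?download=1")
page = is_document_link("https://www.example.com/about/")
```

With the assumed values, `local` and `document` are `True` while `remote` and `page` are `False`.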

Filter:

```python
def build_none_block_content(cache, blocks):
    """
    image_linker is called to link up and retrieve the remote image
    """
    cache = image_linker(cache)
    cache = document_linker(cache)
    blocks.append({"type": "rich_text", "value": cache})
    cache = ""
    return cache
```
Wagtail Block:

```python
rich_text = blocks.RichTextBlock(
    # "h1", "h2", "h3", "h4", "h5", "h6", "image" and "embed" are included
    # to allow editing the content via the admin once the import is complete.
    # They are used while the body content is parsed into blocks.
    features=[
        "anchor-identifier",
        "h1",
        "h2",
        "h3",
        "h4",
        "h5",
        "h6",
        "bold",
        "italic",
        "ol",
        "ul",
        "hr",
        "link",
        "document-link",
        "image",
        "embed",
        "superscript",
        "subscript",
        "strikethrough",
        "blockquote",
    ]
)
```

---

## Configuration

You can add your own configuration to control the Block Building process.

Below is the included configuration. You can copy this to your own settings and add or remove tag-to-block filters.

```python
WAGTAIL_WORDPRESS_IMPORTER_CONVERT_HTML_TAGS_TO_BLOCKS = [
    (
        "h1",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_heading_block"},
    ),
    (
        "h2",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_heading_block"},
    ),
    (
        "h3",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_heading_block"},
    ),
    (
        "h4",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_heading_block"},
    ),
    (
        "h5",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_heading_block"},
    ),
    (
        "h6",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_heading_block"},
    ),
    (
        "table",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_table_block"},
    ),
    (
        "iframe",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_iframe_block"},
    ),
    (
        "form",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_form_block"},
    ),
    (
        "img",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_image_block"},
    ),
    (
        "blockquote",
        {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_block_quote_block"},
    ),
]
```
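
Each `FUNCTION` value is a dotted path: the module path, then a dot, then the function name. As a rough sketch of how such a path can be turned into a callable, the hypothetical helper below uses only the standard library; it is not taken from the importer, which may resolve paths differently (Django projects often use `django.utils.module_loading.import_string` for this).

```python
from importlib import import_module


def resolve_function(dotted_path):
    """Import and return the callable named by a dotted path."""
    module_path, _, name = dotted_path.rpartition(".")
    return getattr(import_module(module_path), name)


# Demonstrated with a standard-library function so the sketch runs anywhere.
dumps = resolve_function("json.dumps")
print(dumps({"type": "raw_html"}))  # prints {"type": "raw_html"}
```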
Examples:

1. To include the `h1`-`h6` HTML tags in the fallback block instead of giving them their own block types, remove the `h1`-`h6` filter configuration items in your own settings.
2. To add extra HTML tag processing, write a function somewhere in your own Wagtail app, then add an item to the config above with the HTML tag as the key and a `FUNCTION` value that is the dotted path to the function you have created. You may also need to include the Wagtail Block in your own app, or you could repurpose one of the provided Block types.
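
Both examples can be sketched together. The snippet assumes a hypothetical app named `myapp` with a `block_builders` module, and a made-up `build_pre_block` filter that reuses the provided `raw_html` block type; `DEFAULTS` is a trimmed copy of the included configuration, kept short for readability.

```python
# Hypothetical filter living in your own app, e.g. myapp/block_builders.py.
def build_pre_block(tag):
    """Keep <pre> elements verbatim by storing them in a raw_html block."""
    return {"type": "raw_html", "value": str(tag)}


# Trimmed copy of the included configuration, for illustration only.
DEFAULTS = [
    ("h1", {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_heading_block"}),
    ("table", {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_table_block"}),
    ("blockquote", {"FUNCTION": "wagtail_wordpress_import.block_builder_defaults.build_block_quote_block"}),
]

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

# Example 1: drop the heading entries so headings stay in the fallback block.
without_headings = [item for item in DEFAULTS if item[0] not in HEADING_TAGS]

# Example 2: register the custom filter by its dotted path ("myapp" is assumed).
WAGTAIL_WORDPRESS_IMPORTER_CONVERT_HTML_TAGS_TO_BLOCKS = without_headings + [
    ("pre", {"FUNCTION": "myapp.block_builders.build_pre_block"}),
]
```

In a real project you would write the full list out in your settings file rather than computing it, but the comprehension makes it clear which entries are removed and which one is added.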
2 changes: 0 additions & 2 deletions wagtail_wordpress_import/block_builder_defaults.py
@@ -356,8 +356,6 @@ def document_linker(html):
document_embed.attrs["linktype"] = "document"
document_embed.attrs["id"] = saved_document.id
document_embed.string = anchor_inner_content
- # image_embed.attrs["alt"] = get_image_alt(image)
- # image_embed.attrs["format"] = get_alignment_class(image)
anchor.replace_with(document_embed)
else:
print(f"DOCUMENT HAS NO HREF: {anchor}")
3 changes: 1 addition & 2 deletions wagtail_wordpress_import/test/tests/test_block_builder.py
@@ -1,8 +1,7 @@
import os

from bs4 import BeautifulSoup
from django.conf import settings
- from django.test import TestCase, override_settings, modify_settings
+ from django.test import TestCase, override_settings
from wagtail_wordpress_import.block_builder import BlockBuilder
from wagtail_wordpress_import.block_builder_defaults import (
build_block_quote_block,