freeports_analysis.formats.algorithms.unstructured.mediolanum_es24_b

MEDIOLANUM_ES24_B format submodule.

This module provides the processing functions for the MEDIOLANUM_ES24_B format, which covers Spanish financial documents with a specific layout.

Functions

deserialize(txt_blk)

Deserialize text blocks into structured data for MEDIOLANUM_ES24_B format.

pdf_filter(xml_root)

Filter PDF content for MEDIOLANUM_ES24_B format.

text_extract(pdf_blocks, targets)

Extract text content from PDF blocks for MEDIOLANUM_ES24_B format.

Classes

PdfBlockType(*values)

Types of PDF blocks for MEDIOLANUM_ES24_B format.

TextBlockType(*values)

Types of text blocks for MEDIOLANUM_ES24_B format.

class freeports_analysis.formats.algorithms.unstructured.mediolanum_es24_b.PdfBlockType(*values)

Types of PDF blocks for MEDIOLANUM_ES24_B format.

class freeports_analysis.formats.algorithms.unstructured.mediolanum_es24_b.TextBlockType(*values)

Types of text blocks for MEDIOLANUM_ES24_B format.

freeports_analysis.formats.algorithms.unstructured.mediolanum_es24_b.deserialize(txt_blk: TextBlock | None) → Any | None

Deserialize text blocks into structured data for MEDIOLANUM_ES24_B format.

Parameters:

txt_blk (Optional[TextBlock]) – Text block to deserialize, or None

Returns:

Deserialized data object or None if input is None

Return type:

Optional[Any]

Notes

Resolves the subfund context and scales market values by a factor of 1000.
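
The behaviour described in the note above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's implementation: Holding, SubfundContext, the (kind, name, raw_value) tuple, and the Spanish number formatting are all hypothetical; only the factor of 1000 and the None passthrough come from the documentation.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Tuple

MARKET_VALUE_SCALE = 1000  # factor stated in the note above


@dataclass
class Holding:
    """Hypothetical result type; the real return type is documented only as Any."""
    subfund: Optional[str]
    company: str
    market_value: Decimal


def scale_market_value(raw: str) -> Decimal:
    """Parse a number and scale it by 1000.

    Assumes Spanish formatting ('.' for thousands, ',' for decimals).
    """
    normalized = raw.strip().replace(".", "").replace(",", ".")
    return Decimal(normalized) * MARKET_VALUE_SCALE


class SubfundContext:
    """Hypothetical helper that remembers the most recent subfund name."""

    def __init__(self) -> None:
        self.current: Optional[str] = None

    def deserialize(self, block: Optional[Tuple[str, str, str]]) -> Optional[Holding]:
        # block is a hypothetical (kind, name, raw_value) tuple, not the real TextBlock.
        if block is None:
            return None
        kind, name, raw_value = block
        if kind == "SUBFUND":
            self.current = name  # update the context, emit nothing
            return None
        return Holding(self.current, name, scale_market_value(raw_value))
```

Using Decimal rather than float keeps the scaled amounts exact, which is usually preferable for monetary values.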

freeports_analysis.formats.algorithms.unstructured.mediolanum_es24_b.pdf_filter(xml_root: Element) → List[PdfBlock]

Filter PDF content for MEDIOLANUM_ES24_B format.

This function processes the XML representation of a PDF and extracts the relevant blocks, covering both subfund information and standard financial data.

Parameters:

xml_root (etree.Element) – Root element of the PDF XML content

Returns:

List of extracted PDF blocks with their types and metadata

Return type:

List[PdfBlock]

Notes

The function first checks for specific Spanish regulatory markers (CNMV), then falls back to standard PDF filtering for financial data extraction.
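
The "marker first, fallback second" order described in the note above might look like the sketch below. Everything here is an assumption made for illustration: the `line` tag, the "SUBFUND" and "DATA" labels, the plain substring test for CNMV, and the trivial standard_filter stand-in. It uses the standard library's xml.etree.ElementTree, which may differ from the etree the module actually uses.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

Block = Tuple[str, str]  # hypothetical (block type, text) pair


def cnmv_marker(element: ET.Element) -> Optional[Block]:
    """Return a subfund block if the element mentions the CNMV marker."""
    text = "".join(element.itertext()).strip()
    if "CNMV" in text:
        return ("SUBFUND", text)
    return None


def standard_filter(element: ET.Element) -> Optional[Block]:
    """Stand-in for the standard filter: keep any non-empty text."""
    text = "".join(element.itertext()).strip()
    return ("DATA", text) if text else None


def pdf_filter_sketch(xml_root: ET.Element) -> List[Block]:
    blocks: List[Block] = []
    for line in xml_root.iter("line"):  # 'line' is an assumed tag name
        # Try the regulatory marker first, then fall back.
        block = cnmv_marker(line) or standard_filter(line)
        if block is not None:
            blocks.append(block)
    return blocks


SAMPLE = (
    "<page>"
    "<line>Fondo X. Registro CNMV: 1234</line>"
    "<line>ACME SA 1.234 2,5</line>"
    "<line>   </line>"
    "</page>"
)
```

Calling pdf_filter_sketch(ET.fromstring(SAMPLE)) yields one subfund block and one data block; the whitespace-only line is dropped.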

freeports_analysis.formats.algorithms.unstructured.mediolanum_es24_b.text_extract(pdf_blocks: List[PdfBlock], targets: List[str]) → List[TextBlock]

Extract text content from PDF blocks for MEDIOLANUM_ES24_B format.

Parameters:
  • pdf_blocks (List[PdfBlock]) – List of PDF blocks to extract text from

  • targets (List[str]) – List of target identifiers for text extraction

Returns:

List of extracted text blocks with their types and metadata

Return type:

List[TextBlock]

Notes

Extracts both subfund information and standard financial data; the financial data is read from specific column positions.
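
Position-based extraction of the kind mentioned in the note above could be sketched as follows. The column layout is purely hypothetical (the documentation does not state which positions are used), as are extract_row, the dictionary keys, and the case-insensitive substring match against targets.

```python
from typing import Dict, List, Optional

# Hypothetical layout: the last token is the percentage of net assets,
# the second-to-last token is the market value, and the rest is the name.
MARKET_VALUE_COL = -2
PERCENT_COL = -1


def extract_row(row: str, targets: List[str]) -> Optional[Dict[str, str]]:
    """Return the fields of a row if its name matches one of the targets."""
    tokens = row.split()
    if len(tokens) < 3:
        return None  # not enough columns for name + two values
    name = " ".join(tokens[:MARKET_VALUE_COL])
    match = next((t for t in targets if t.lower() in name.lower()), None)
    if match is None:
        return None
    return {
        "target": match,
        "market_value": tokens[MARKET_VALUE_COL],
        "percent": tokens[PERCENT_COL],
    }
```

Counting columns from the right keeps the sketch independent of how many words the name contains; a real implementation may instead rely on page coordinates.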