This repository hosts both kds-team.ex-email-content and kds-team.ex-ms-outlook-email-content. The latter exists to support the Office 365 version of MS Outlook.
This component allows you to extract the email body and other metadata using the IMAP protocol.
Table of contents:
[TOC]
NOTE: The default authority https://login.microsoftonline.com/common can be overridden with the authority parameter in image_parameters.
Enable the IMAP service on your email account. Please refer to your email provider for more information.
Note that for Gmail you will need to use an App Password or, alternatively (not recommended), enable access for "less secure" apps.
For MS Outlook in the Office 365 suite, you will need to grant permission using OAuth.
Note that by default the app fetches emails from the root INBOX folder. If you use labels or filters (in Gmail, for instance) that move messages to a different folder,
please set the imap_folder configuration parameter to that folder.
| Feature | Note |
|---|---|
| Generic UI form | Dynamic UI form |
| Row based configuration | Allows execution of each row in parallel. |
| Incremental loading | Allows fetching data in new increments. |
| IMAP query syntax | Filter emails using standard IMAP query |
| Download email contents | Full body of email downloaded into the Storage column |
| Download email attachments | All attachments downloaded by default into a file storage. |
| Filter email attachments | Download only attachments matching a specified regular expression |
| Processors support | Use processors to modify the outputs before saving to storage, e.g. process attachments to be stored in the Tabular Storage |
- `connection_method` -- `imap` (default) or `graph_api` (MS Outlook variant only)
- `#password` -- IMAP password; not needed for the `graph_api` connection method
- `user_name` -- login / email address
- `host` -- IMAP host (IMAP only)
- `query` -- IMAP search query, e.g. `(FROM "email" SUBJECT "the subject" UNSEEN)`. More information here. IMAP only.
- `graph_filter` -- OData `$filter` expression for Graph API exact matching. Can be combined with `date_since`. Graph API only.
- `graph_search` -- KQL search expression for Graph API keyword/partial matching. Cannot be combined with `graph_filter` or `date_since`. Graph API only.
- `imap_folder` -- Folder to get the emails from. Defaults to `INBOX`. For IMAP: folder path. For Graph API: well-known name (`inbox`, `sentitems`, etc.) or display name.
- `date_since` -- Date in `YYYY-MM-DD` format or a dateparser string (e.g. `5 days ago`). Cannot be combined with `graph_search`.
- `download_content` -- (boolean) If true, the content of the email will be downloaded into the `out/tables/emails.csv` table.
- `mark_seen` -- (boolean) When set to true, emails that have been extracted will be marked as seen in the inbox.
- `download_attachments` -- (boolean) If true, attachments of the email will be downloaded into the `out/files/` folder, prefixed by the generated email `pk`.
- `attachment_pattern` -- (str) Applicable only with `download_attachments: true`. Regex pattern to filter particular attachments, e.g. to retrieve only PDF files use `.+\.pdf`
IMAP search query. E.g. (FROM "email" SUBJECT "the subject" UNSEEN)
More information on keywords here
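As a rough illustration of how such a search string can be assembled, the sketch below builds the criteria from optional parts. The helper name `build_imap_query` and its arguments are hypothetical and not part of the component; only the resulting string is what would go into the `query` parameter.

```python
def build_imap_query(sender=None, subject=None, unseen=False):
    """Assemble an IMAP SEARCH criteria string.

    Hypothetical helper for illustration only; not part of the component.
    """
    def quote(value):
        # IMAP quoted strings escape backslashes and double quotes.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    parts = []
    if sender:
        parts.append("FROM " + quote(sender))
    if subject:
        parts.append("SUBJECT " + quote(subject))
    if unseen:
        parts.append("UNSEEN")
    # Criteria listed side by side are ANDed; ALL matches every message.
    return "(" + " ".join(parts) + ")" if parts else "ALL"


query = build_imap_query(sender="email", subject="the subject", unseen=True)
print(query)  # (FROM "email" SUBJECT "the subject" UNSEEN)
```

Building the string programmatically mainly helps with quoting when the sender or subject comes from user input.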
OData $filter expression for exact matching. Can be combined with date_since.
| Need | Example |
|---|---|
| Exact sender | from/emailAddress/address eq 'someone@example.com' |
| Exact subject | subject eq 'Exact Subject Line' |
| Has attachments | hasAttachments eq true |
| Combined | from/emailAddress/address eq 'x@y.com' and subject eq 'text' |
Note: contains() on subject/body is not supported by the messages endpoint and returns HTTP 400.
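The filter expressions in the table can be composed with a small amount of string handling. The following is a minimal sketch, assuming you build the `graph_filter` value in Python before putting it into the configuration; `odata_eq` and `odata_and` are hypothetical helper names, not functions of the component or of any Graph SDK.

```python
def odata_eq(field, value):
    """Return an OData `field eq value` clause (illustrative helper)."""
    if isinstance(value, bool):
        # OData boolean literals are lowercase and unquoted.
        return f"{field} eq {str(value).lower()}"
    # OData string literals use single quotes; an embedded quote is doubled.
    escaped = str(value).replace("'", "''")
    return f"{field} eq '{escaped}'"


def odata_and(*clauses):
    """Join clauses so that all of them must match."""
    return " and ".join(clauses)


graph_filter = odata_and(
    odata_eq("from/emailAddress/address", "x@y.com"),
    odata_eq("subject", "text"),
)
print(graph_filter)
# from/emailAddress/address eq 'x@y.com' and subject eq 'text'
```

Doubling single quotes matters for subjects such as `Bob's report`, which would otherwise terminate the string literal early.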
KQL search expression for keyword and partial matching. Cannot be combined with graph_filter or date_since — this is a Microsoft Graph API limitation. Returns up to 1,000 results.
| Need | Example | Notes |
|---|---|---|
| Subject keyword | subject:weekly | substring match |
| Subject exact phrase | subject:"exact multi-word phrase" | use double quotes |
| Sender full or partial | from:someone@example.com or from:MSSecurity | substring match |
| Date exact | received:2026-03-17 | single day |
| Date range | received:2026-01-01..2026-01-31 | inclusive, .. syntax only |
| All combined | from:sender subject:keyword received:2026-01-01..2026-03-18 | space-separated = AND |
Note: relative date keywords (received:this week, received:today) are not supported on the messages endpoint and are treated as free-text.
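Because relative date keywords are not honoured, one workaround is to compute absolute dates yourself and place them into the `received:` range. The sketch below shows this under the assumption that the `graph_search` value is generated by a script; `kql_search` and `last_n_days` are hypothetical names used only for illustration.

```python
from datetime import date, timedelta


def last_n_days(n, today):
    """Return (start, end) dates covering the last n days up to today."""
    return today - timedelta(days=n), today


def kql_search(sender=None, subject=None, received_from=None, received_to=None):
    """Build a space-separated (AND) KQL expression; illustrative helper."""
    terms = []
    if sender:
        terms.append(f"from:{sender}")
    if subject:
        # Multi-word phrases need double quotes; single keywords do not.
        terms.append(f'subject:"{subject}"' if " " in subject else f"subject:{subject}")
    if received_from and received_to:
        terms.append(f"received:{received_from.isoformat()}..{received_to.isoformat()}")
    elif received_from:
        # A single date matches that day only.
        terms.append(f"received:{received_from.isoformat()}")
    return " ".join(terms)


start, end = last_n_days(7, today=date(2026, 3, 18))
graph_search = kql_search(sender="sender", subject="keyword",
                          received_from=start, received_to=end)
print(graph_search)
# from:sender subject:keyword received:2026-03-11..2026-03-18
```

In real use you would pass `date.today()` instead of the fixed date, which is pinned here only to keep the output reproducible.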
| Use case | Field | Combinable with date_since? |
|---|---|---|
| Exact sender / exact subject | graph_filter | Yes |
| Keyword / partial matching | graph_search | No; use received: in KQL instead |
| Keyword + date range | graph_search with received:YYYY-MM-DD..YYYY-MM-DD | N/A (date is in KQL) |
| Date only | date_since | N/A |
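
The combination rules above can be checked before a configuration is submitted. This is a minimal pre-flight sketch, not the component's own validation (which may behave differently); `check_graph_params` is a hypothetical function that takes the parameters as a plain dictionary.

```python
def check_graph_params(params):
    """Return a list of messages for conflicting Graph API options.

    Illustrative only: encodes the rule that graph_search cannot be
    combined with graph_filter or date_since.
    """
    errors = []
    if params.get("graph_search"):
        if params.get("graph_filter"):
            errors.append("graph_search cannot be combined with graph_filter")
        if params.get("date_since"):
            errors.append(
                "graph_search cannot be combined with date_since; "
                "use received: in the KQL expression instead"
            )
    return errors


ok = check_graph_params({"graph_filter": "hasAttachments eq true",
                         "date_since": "2026-01-01"})
bad = check_graph_params({"graph_search": "subject:weekly",
                          "date_since": "5 days ago"})
print(ok)   # []
print(bad)  # one message about date_since
```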
{
  "#password": "xxxxx",
  "user_name": "example@gmail.com",
  "host": "imap.gmail.com",
  "port": 993,
  "query": "(FROM \"email\" SUBJECT \"the subject\" UNSEEN)",
  "download_content": true,
  "download_attachments": true,
  "attachment_pattern": ".+\\.pdf"
}
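The double quotes inside `query` and the backslash in `attachment_pattern` have to be escaped in JSON. If you generate the configuration from a script, letting a JSON serializer do the escaping avoids mistakes. The sketch below assumes the same example values as above and uses only the Python standard library.

```python
import json

# Values are written as plain Python strings; json.dumps adds the escaping.
params = {
    "#password": "xxxxx",
    "user_name": "example@gmail.com",
    "host": "imap.gmail.com",
    "port": 993,
    "query": '(FROM "email" SUBJECT "the subject" UNSEEN)',
    "download_content": True,
    "download_attachments": True,
    "attachment_pattern": r".+\.pdf",
}

config_json = json.dumps(params, indent=2)
print(config_json)
```

The printed text contains `\"email\"` and `.+\\.pdf`, and parsing it back yields the original unescaped values.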
Single table named emails.
Columns: ['pk', 'uid', 'mail_box', 'date', 'from', 'to', 'body', 'headers', 'number_of_attachments', 'size']
Attachments are stored in out/files/, prefixed by the generated message pk, e.g. out/files/bb41793268d4a8710fb5ebd94eaed6bc_some_file.pdf
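To see which attachments a given `attachment_pattern` would keep and how the resulting file names look, the following sketch may help. It assumes the pattern must match the whole file name (`re.fullmatch`); the component may apply the regex differently, so treat this as an approximation. `select_attachments` is a hypothetical helper.

```python
import re


def select_attachments(pk, file_names, pattern=None):
    """Return output paths for attachments whose names match `pattern`.

    Assumption: whole-name matching via re.fullmatch. With no pattern,
    every attachment is kept, mirroring the documented default.
    """
    regex = re.compile(pattern) if pattern else None
    selected = []
    for name in file_names:
        if regex is None or regex.fullmatch(name):
            # Files are prefixed by the generated message pk.
            selected.append(f"out/files/{pk}_{name}")
    return selected


paths = select_attachments(
    "bb41793268d4a8710fb5ebd94eaed6bc",
    ["some_file.pdf", "logo.png", "notes.pdf.txt"],
    r".+\.pdf",
)
print(paths)
# ['out/files/bb41793268d4a8710fb5ebd94eaed6bc_some_file.pdf']
```

Note the escaped dot in `.+\.pdf`: an unescaped `.` matches any character, so `.+.pdf` would also accept a name such as `reportXpdf`.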
If required, change the local data folder path (the CUSTOM_FOLDER placeholder) to
your custom path in the docker-compose file:
volumes:
- ./:/code
- ./CUSTOM_FOLDER:/data
Clone this repository, initialize the workspace, and run the component with the following commands:
git clone https://bitbucket.org/kds_consulting_team/kds-team.ex-email-content.git
cd kds-team.ex-email-content
docker-compose build
docker-compose run --rm dev
Run the test suite and lint check using this command:
docker-compose run --rm test
For information about deployment and integration with KBC, please refer to the deployment section of the developer documentation.