Category: Mindmapping

Breaking A Large Mindmap Into Smaller Mindmaps For Analysis (XMind)

This post expands on previous posts (see summary here) regarding the use of XMind for modelling data with openEHR. The aim is to document our processes in an open and reproducible manner.
Image Courtesy of Lesekreis (Own work) [CC0], via Wikimedia Commons

 

In the first step of our project, we collated all the datapoints required for the different registries and operational systems (we shall call each registry or system a “section” of the mindmap from now on). Many datapoints were duplicated across different sections, so we cross-referenced them to identify the overlap. The next step is to use that information to identify the unique datapoints we need to capture. As the main mindmap is usually very large and cumbersome at this point, we break it up into separate sub-mindmaps for analysis. This can be challenging as:

  • XMind does not support synchronisation between different mindmaps. Therefore, if you make a change to any datapoint, you must ensure that the change is made to every mindmap in which the datapoint exists.
  • Multiple team members may be working on the mindmaps. Therefore, if changes need to be made, there should ideally be one person overseeing them, to prevent multiple forks of the mindmaps.
  • We should ideally ensure that every datapoint has been copied across to the sub-mindmaps by the end of this process, which can be tricky when there are hundreds of datapoints. To facilitate this, we place a marker (aka “Modelling has started”) on each datapoint of the main mindmap once it has been copied over to a sub-mindmap. Towards the end of the modelling process we can search the main mindmap for any datapoints which have not been marked in this way, and which are therefore yet to be modelled.
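The bookkeeping described above is easier to reason about if each datapoint is tracked as a single record. Below is a purely illustrative Python sketch (the datapoint names and fields are hypothetical, not taken from our registries) of how the cross-referencing and the “has this been copied yet?” question might be expressed.

```python
# A toy sketch (hypothetical names and fields) of the bookkeeping problem:
# each datapoint is recorded once, together with the sections (registries)
# it appears in and the sub-mindmaps it has been copied to, so that nothing
# is silently lost when the main mindmap is split up.
from dataclasses import dataclass, field

@dataclass
class Datapoint:
    name: str                                       # node name on the mindmap
    sections: set = field(default_factory=set)      # registries it appears in
    sub_mindmaps: set = field(default_factory=set)  # sub-mindmaps it was copied to

datapoints = {
    "Ethnicity": Datapoint("Ethnicity", {"COSD", "SACT"}),
    "Date of birth": Datapoint("Date of birth", {"COSD", "Genomics"}),
}

# Copying a datapoint to a sub-mindmap mirrors placing the
# "Modelling has started" marker on the main mindmap.
datapoints["Date of birth"].sub_mindmaps.add("Demographics")

# Anything with an empty sub_mindmaps set still needs to be copied across.
still_to_copy = [d.name for d in datapoints.values() if not d.sub_mindmaps]
print(still_to_copy)  # ['Ethnicity']
```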

 

After much discussion we agreed the following rules for breaking down the main mindmap into the sub-mindmaps.

  1. Decide on the initial categories for the sub-mindmaps e.g. Diagnosis, Pathology, Imaging, Demographics.

  2. Copy across the relevant datapoints to the sub-mindmap. It is useful to keep the datapoints from different sections on the main mindmap (eg COSD, Genomics) separated under similarly named sections on the sub-mindmap.
  3. In the main mindmap (not the sub-mindmap), place the “Modelling has started” marker on each datapoint once it has been copied over to a sub-mindmap.
  4. When we believe we have copied across all the relevant items to a sub-mindmap, we can begin the analysis step.
    Begin from the top of the sub-mindmap and examine each datapoint. Using the labels and notes we can identify the duplicate datapoints that should have also been transferred from the other sections.
    If one of the duplicate datapoints has not been transferred to the sub-mindmap, copy it across at this point and make sure it is marked as “Modelling has started” on the main mindmap. We then need to remove all the duplicates from the sub-mindmap, so that we are only dealing with unique datapoints.
    It is good practice to nominate one section as your base dataset (eg COSD in cancer) – these datapoints will be preferentially preserved, and duplicates from other datasets removed. This will make it slightly easier to keep track of all the datapoints in the future.

  5. We should be aware that there may have been errors in data synchronisation during the previous steps of mindmapping, so we must be very careful when deleting duplicate datapoints. We should examine all the labels and notes of all the duplicate datapoints, and make sure all unique information is copied across to the datapoint we keep. This information (eg mappings to datapoints in different datasets, ordinality, issues to be resolved etc) is very useful for archetyping and future data mapping, so it is vital that it is not inadvertently lost.
    The notes should include all the registry codes to which the datapoint maps, including the name of the parent datapoint. Eg in the COSD section, the datapoint name itself will contain the COSD code, but we should ensure that the attached notes also list the COSD code, for completeness.

  6. Once we have processed a datapoint as above (consolidated information from, and then deleted, the duplicate datapoints), we mark it with a blue flag to show that we have completed the analysis phase.
    If there are any questions to resolve (eg regarding possible mappings, or possible errors detected), we mark it with a red flag instead.

  7. If we are unsure whether one of the duplicate fields is truly a duplicate, we do not delete it, and we also mark it with a red flag. Eg if there are 4 definitely identical datapoints and 2 further possible (but not definite) duplicates:
    – Keep one of the “definites” and the 2 “possibles”. Mark these with red flags.
    – Delete the other 3 datapoints (the redundant “definites”).

  8. After you have formed your main sub-mindmaps, go back and check whether all datapoints on the main mindmap have a “Modelling has started” marker. If not, they have not yet been processed (a scripted check along these lines is sketched after this list).
    If an unprocessed datapoint does not fit any of the categories of the existing sub-mindmaps, a new sub-mindmap (e.g. “Miscellaneous”) may be needed to house these orphan datapoints.
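For step 8, the search for unmarked datapoints can also be scripted. The sketch below assumes the classic XMind 8 file format, where a .xmind file is a zip archive whose content.xml nests <topic> elements that may carry <marker-ref> children; newer XMind releases use a JSON-based format instead, and the marker id shown is hypothetical (inspect your own content.xml to find the id that XMind assigned to the custom “Modelling has started” marker).

```python
# A hedged sketch of the step 8 check: list leaf topics (datapoints) in the
# main mindmap that do not yet carry the "Modelling has started" marker.
# Assumes the classic XMind 8 format (zip archive containing content.xml);
# the marker id below is hypothetical and should be taken from your own file.
import zipfile
import xml.etree.ElementTree as ET

MODELLING_STARTED_ID = "modelling-has-started"  # hypothetical marker id

def local(tag):
    """Strip any XML namespace prefix, e.g. '{urn:...}topic' -> 'topic'."""
    return tag.rsplit("}", 1)[-1]

def unmarked_leaf_topics(xmind_path):
    """Return the titles of leaf topics (datapoints) lacking the marker."""
    with zipfile.ZipFile(xmind_path) as zf:
        root = ET.fromstring(zf.read("content.xml"))

    missing = []
    for topic in root.iter():
        if local(topic.tag) != "topic":
            continue
        # In our convention only leaf nodes are datapoints.
        if any(local(child.tag) == "children" for child in topic):
            continue
        marker_ids = {ref.get("marker-id")
                      for ref in topic.iter() if local(ref.tag) == "marker-ref"}
        if MODELLING_STARTED_ID not in marker_ids:
            title = next((c for c in topic if local(c.tag) == "title"), None)
            missing.append(title.text if title is not None else "(untitled)")
    return missing

if __name__ == "__main__":
    for name in unmarked_leaf_topics("main-mindmap.xmind"):
        print("Not yet copied to a sub-mindmap:", name)
```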

 

This approach has helped us to begin tackling the large datasets. However, minor problems will undoubtedly crop up, and we will document them below as they do:

  • Some datapoints from different registries refer to the same data, but at different time points (eg “PSA at diagnosis” vs “PSA at start of treatment”). We would expect our longitudinal data model to handle both datapoints, using the same archetype. We should treat these as duplicate datapoints, but this timing information should be recorded in the notes section, as it will be useful for future mappings. We retain only one of the datapoints, but mark it with a red flag to indicate that the mapping will not be straightforward.
  • Some event metadata for items (eg “Event Date” in Genomics) can be difficult to map to other data points. We would expect this type of metadata to be handled by the reference model of openEHR, so we don’t dwell too long on this.

openEHR Modelling with XMind – Part 3 (Summary)

Part 3 is a quick reference summary of the more detailed descriptions in Part 1 and Part 2.

In this series of posts we document how we began the process of modelling our data using mindmaps. The aim is to document our processes in an open and reproducible manner.

 

Summary

  • Mindmap nodes:
    • The registry code for a data point is documented within the name of the node – eg “CR0520 – T category (final pretreatment)”.
    • The data point name is often the lowest-level node in the tree – ie there are no further “daughter nodes” beneath the data point name.
      • However some modellers list the constrained value set of the data point using daughter nodes, grouped together with a boundary box.

 

  • Labels:
    • These are used as tags for searching. There can only be one label per node, but this can contain multiple tags.
    • Tags should be one word long (but words can be concatenated eg “COSDcore”) and separated by commas.
    • Document the different datasets / registries in which the datapoint resides via these labels.
      • Eg if a datapoint is in both “COSD core” and “Genomics”, it would be listed under both parent nodes (with the correct corresponding field code in each section), and would be tagged with both “COSDcore” and “Genomics” each time – ie “COSDcore, Genomics”.
    • Also document the chosen archetype with a label corresponding to the archetype name – eg “CLUSTER.tnm_staging_7th-prostate.v1”.

 

  • Notes:
    • Used for longer text notes about the data point.
    • We recommend that you put any codes from other data points in this section (eg “CR0620 in COSD is equivalent to 14944.1 in Genomics”).
    • You may also choose to document information about the field type, possible candidate archetypes, or background information including definitions of terms. If boundary boxes have not been used, the constrained value set may be defined here (eg Male / Female / Other).

 

  • Comments:
    • The comments section should only be used for discussion between different people working on the project. If any aspect of the data point needs to be changed, it should be documented within the name field, the labels, notes or markers.

 

  • Markers:
    • Please use the custom markers we have provided (“openEHR-infogather.xmp” – a zip file containing the file can be found here).
    • full_match  Full match – available archetype usable without modification.
    • no_match  No match – new archetype needed.
    • partial_match  Partial match – available archetype usable, but will require modification.
    • indeterminate  Indeterminate – further analysis required.

 

More detailed descriptions can be found in Part 1 and Part 2 of this article.

openEHR Modelling with XMind – Part 2

Part 1 of this series can be found here. 

In this series of posts we document how we began the process of modelling our data using mindmaps. The aim is to document our processes in an open and reproducible manner.

In Part 1 we examined the initial stages of mindmapping. Here in Part 2, we look at the next steps, leading up to archetyping in openEHR.

 

Analysing the data points and identifying overlaps

The 2 images below display the different annotations we can use during the analysis phase. Our convention is to use labels, notes and comments for different functions.

labels notes comments

labels notes comments - labelled

 

Labels

These are added to any datapoint by pressing F3, via the drop down menus at the top of the page (Modify > Label), or by right clicking on a data point (Insert > Label). Only one label can be added per node, but this may contain multiple different tags – these are listed in a box beneath the name of the data point.

labels notes comments

For example, in the image above the data point CR0620 has been labelled with the tags “COSDcore” and “Genomics”. This indicates that the data point is present in both the COSD Core dataset and the Genomics dataset. Using the tags therefore allows us to highlight overlaps in datasets.

To keep things simple, please use single words without spaces for the tags (though you can use concatenated words – eg the terms “COSD” & “core” together become “COSDcore”). The tags are separated by commas, as seen in the image above.

Note:

  1. These tags can then be used to search through or filter down the data points, which is why they are so useful.
  2. The tags are also used when assigning data points to archetypes – see later.
  3. XMind automatically reorders the tags into alphabetical order.
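As a brief illustration of point 1, once each node’s label has been split into its comma-separated tags, the overlaps between datasets can also be pulled out programmatically, alongside XMind’s own search and filter tools. The datapoint names and labels below are illustrative only.

```python
# Illustrative only: split comma-separated label tags and find datapoints
# that appear in both the COSD core and Genomics datasets.
labels = {
    "CR0620": "COSDcore, Genomics",
    "CR0520 - T category (final pretreatment)": "COSDcore",
    "Date of birth": "COSDcore, Genomics, SACT",
}

tags_by_datapoint = {
    name: {tag.strip() for tag in label.split(",")}
    for name, label in labels.items()
}

overlap = [name for name, tags in tags_by_datapoint.items()
           if {"COSDcore", "Genomics"} <= tags]
print(overlap)  # ['CR0620', 'Date of birth']
```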

 

Notes

These are added to any datapoint by pressing F4, via the drop down menus at the top of the page (Modify > Notes), or by right clicking on a data point (> Notes). A small notes icon appears to the right of the data point name when there is a note attached. Click on the icon to reveal the notes, as in the image below.

Notes

The “notes” function is used for longer text notes about the data point. We recommend that you put any codes from other data points in this section (eg in the above image CR0620 in COSD is equivalent to 14944.1 in Genomics).

You may also choose to document information about the field type including value sets (eg Male / Female / Other) in the notes section.
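If the equivalence notes follow a consistent wording, as in the example above, they can later be harvested automatically for the mapping work. The sketch below assumes the exact phrasing “X in A is equivalent to Y in B”; it is an illustration, not part of our toolchain.

```python
# Assumes notes use the consistent phrasing "<code> in <dataset> is
# equivalent to <code> in <dataset>"; adjust the pattern if your wording differs.
import re

EQUIVALENCE = re.compile(
    r"(?P<code_a>[A-Za-z0-9.]+) in (?P<set_a>\w+) "
    r"is equivalent to "
    r"(?P<code_b>[A-Za-z0-9.]+) in (?P<set_b>\w+)"
)

note = "CR0620 in COSD is equivalent to 14944.1 in Genomics"
match = EQUIVALENCE.search(note)
if match:
    print(match.groupdict())
    # {'code_a': 'CR0620', 'set_a': 'COSD', 'code_b': '14944.1', 'set_b': 'Genomics'}
```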

 

Comments

These are added to any datapoint by clicking on the comments icon in the top toolbar, via the drop down menus at the top of the page (Modify > Comments), or by right clicking on a data point (> Comments). A small comments icon appears to the right of the data point name when there is a comment attached. Click on the icon to reveal the comments, as in the image below.

comments

Multiple comments can be added by different people. This section is therefore used for discussion during collaboration.

comments 2

The comments section should only be used for discussion between different people working on the project. If any aspect of the data point itself needs to be changed, it should be documented using the name field, the labels, notes or markers.

 

Identification of commonality with existing archetypes

We aim to identify which data points can be collected using archetypes that are already available, and which cannot.

  1. Some data points will be completely covered by archetypes that have already been published (full match).
  2. Some data points will require modification of available archetypes (partial match).
  3. Some data points will require the development of new archetypes (no match).

To document this process we have created a series of markers which can be used in XMind. The markers can be imported by opening the file “openEHR-infogather.xmp” – a zip file containing the file can be found here. The markers should be used in the following way:

full_match  Full match – available archetype usable without modification.

no_match  No match – new archetype needed.

partial_match  Partial match – available archetype usable, but will require modification.

indeterminate  Indeterminate – further analysis required.

The easiest way to use the markers is to open the Markers window – via the drop down menus at the top of the page (Window > Markers). The imported markers are usually at the bottom of the window, under the heading “openEHR tasks”.

Markers window

The markers can be inserted by clicking on the node and then clicking on the required marker. The results will look similar to the image below.

Markers

The name of a relevant archetype can also be added as a label – see the labels under CR0620 on the following image.

archetype
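The match markers also make it easy to report archetyping progress. Below is a small sketch with purely illustrative data; it assumes the marker name for each datapoint has already been extracted (for example by parsing the mindmap file).

```python
# Illustrative data: tally how many datapoints fall into each match category.
from collections import Counter

marker_by_datapoint = {
    "CR0520 - T category (final pretreatment)": "Full match",
    "CR0620": "Partial match",
    "Ethnicity": "No match",
    "Event Date": "Indeterminate",
}

coverage = Counter(marker_by_datapoint.values())
for status in ("Full match", "Partial match", "No match", "Indeterminate"):
    print(f"{status}: {coverage.get(status, 0)}")
```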

 

And there you are! An outline of how XMind can be used for modelling in openEHR. A summary of the processes outlined in Parts 1 & 2 can be found in Part 3.

openEHR Modelling with XMind – Part 1

Introduction

In this series of posts we document how we began the process of modelling our data using mindmaps. The aim is to document our processes in an open and reproducible manner.

Clinical modelling can appear daunting at first. To reduce the complexity, we break the process down into different steps:

  1. Define the data points. eg the datasets from different registries; the different variables being collected in a study.
  2. Identify whether data points from the different sources overlap. For example, demographic details are usually collected in many different registries: COSD, SACT and RTDS all collect demographic data for the same patient, though the details may vary slightly (eg address at diagnosis vs address during treatment).
  3. After identifying the overlaps, we define the unique data points that need to be collected.
  4. Identify which data points can be collected using archetypes which are already available, and which cannot:
    1. Some data points will be completely covered by archetypes that have already been published.
    2. Some data points will require modification of available archetypes.
    3. Some data points will require the development of new archetypes.
  5. Develop the new archetypes and modify existing archetypes as needed.
  6. Model the business process of data collection (eg how data is processed / collected in the context of an MDT).
  7. Produce templates (corresponding to the forms needed to collect data at each point of the pathway) using the archetypes.

There is a wide variation in the way that people approach steps 1 to 4. A mindmapping tool is often used for this purpose – we recommend XMind, which is available on Windows, Mac and Linux, as well as in portable packages. In this series of articles, we describe a standard process for using XMind to model in openEHR.

 

Defining the data points

The aim of the mind map is to give a good overview of how the data is structured, such that other modellers or domain experts can understand the relationship of the data points.

In urological cancer, we began by listing the different registries and data repositories as well as the needs of the different medical specialties – we plotted these as different nodes on the mindmap.

Level 1

We then explored each of these areas to define the datasets. Below we expand the different sections of the COSD dataset into further levels of daughter nodes, until we get to the data points themselves.

Level 2

Level 3

Level 4

Below is a close up of the staging section from the image above, containing the data point names.

Level 4 close up

Our convention is to document any codes for the data points within the node name – eg in the image above, “T category (final pretreatment)” has a code of CR0520 in the COSD core dataset.
Note: the yellow text shows the labels attached to each datapoint (see below).
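Because this convention puts the registry code at the start of the node name, the code and the human-readable name can be separated mechanically later on. Here is a minimal sketch, assuming a “CODE – name” pattern like the COSD example above.

```python
# Split a node name of the form "CODE - name" / "CODE – name" into its parts.
# The code pattern below (two letters then digits) matches COSD-style codes
# such as CR0520; adjust it for other registries.
import re

NODE_NAME = re.compile(r"^(?P<code>[A-Z]{2}\d+)\s*[-\u2013]\s*(?P<name>.+)$")

match = NODE_NAME.match("CR0520 – T category (final pretreatment)")
if match:
    print(match.group("code"))  # CR0520
    print(match.group("name"))  # T category (final pretreatment)
```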

The data point name is usually the lowest level node in the tree – ie no further “daughter nodes” to the right of the data point name.

However, some modellers like to document the value sets for some data points on the next level down using a “boundary” function. Boundaries are boxes surrounding a group of nodes – a boundary can be created by highlighting the nodes and pressing Ctrl-B, by clicking the boundary icon on the top toolbar, via the drop down menus at the top of the page (Insert > Boundary), or by right clicking on a data point (> Boundary).

The example below shows the use of boundaries to define value sets:

  1. The data point “Joint” can have a value of “Left elbow”, “Right elbow”, “Left knee”, “Right knee”, “Left ankle” or “Right ankle”.
  2. The data point “Duration” can have a value of either “0: No swelling or less than 6 months” or “1: Greater than or equal to 6 months”.

Boundaries

This latter approach (using boundaries) can be very useful when dealing with small projects / mindmaps. However, it can result in a more cluttered appearance in larger mindmaps. Therefore some modellers break down the large mindmap into smaller mindmaps in the later stages of the project.
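Whichever way the value sets are recorded (boundaries or notes), it can be worth writing them down as plain data at some point, since they become value constraints when the archetypes are built. A trivial, purely illustrative sketch using the example above:

```python
# The value sets from the boundary example above, written down as plain data
# so they can be reused when defining archetype constraints later.
value_sets = {
    "Joint": [
        "Left elbow", "Right elbow",
        "Left knee", "Right knee",
        "Left ankle", "Right ankle",
    ],
    "Duration": [
        "0: No swelling or less than 6 months",
        "1: Greater than or equal to 6 months",
    ],
}

assert "Left knee" in value_sets["Joint"]
```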

 

Part 2 of this series can be found here. In Part 2 we discuss how to categorise the data points, identify links between different datasets and use the mindmap to guide archetyping in openEHR.