The description from the research group is:
Based on testing with a proof-of-principle to generate disassembly maps (in array format), what would you recommend to our team as priority data collection/processing/cleansing steps?
There are a variety of data sources; see [1] to [5] in the doc file above. Instead of focusing on what can be extracted from these data sources, or on what the data collection process should be, we invert the problem: what data is needed for disassembly maps?
For this purpose, we obtained a disassembly map example and the data needed to create this map.

Figure 1: Disassembly map from the research group for the Fairphone Disassembly.
The disassembly map in Figure 1 was provided by the research group. The aim is to build an AI model that takes input from the various data sources and produces a delimited text file from which such a disassembly map can be created.
This map is auto-generated (as far as we are aware) with the help of the data in the file
Dissassembly code map Fair Phone 2.xlsx
The concept of the disassembly map came from the paper by De Fazio
1_s2.0_S0959652621027608_main.pdf
Based on the data used to create the disassembly map, we see that there are multiple tables with legends, and that the map itself is ultimately created from the ‘connection table’. Our first and foremost advice is to combine the relevant data from all your data sources into a usable database. This would indirectly involve data pre-processing and cleaning steps. We have two possible database solutions.
This is our preferred solution. However, it requires a fair amount of software and programming knowledge. In this approach, you create a relational database in a relational database management system (RDBMS), for example PostgreSQL.
In this database, there will be tables similar to the tables in your Excel file. Since it is a relational database, you can link different tables with each other. In this way, the final connection table, which contains all the information for the disassembly map, can be created by linking multiple tables.
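The linking idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the table names (component, connector_type, connection), the column names, and the three example rows are hypothetical and are not taken from the Excel file. SQLite from the Python standard library stands in for PostgreSQL so that the sketch runs without a database server; the SQL itself is plain enough to carry over.

```python
import sqlite3

# Assumption: in-memory SQLite is used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: two legend tables and one table of links between parts.
conn.executescript("""
CREATE TABLE component (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE connector_type (
    code        TEXT PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE connection (
    parent_id      INTEGER NOT NULL REFERENCES component(id),
    child_id       INTEGER NOT NULL REFERENCES component(id),
    connector_code TEXT NOT NULL REFERENCES connector_type(code)
);
""")

# Made-up placeholder rows, not real Fairphone data.
conn.executemany(
    "INSERT INTO component VALUES (?, ?)",
    [(1, "back cover"), (2, "battery"), (3, "display module")],
)
conn.executemany(
    "INSERT INTO connector_type VALUES (?, ?)",
    [("SF", "snap fit"), ("SL", "slider")],
)
conn.executemany(
    "INSERT INTO connection VALUES (?, ?, ?)",
    [(1, 2, "SF"), (1, 3, "SL")],
)


def connection_table(db):
    """Resolve the ids and codes in 'connection' against the legend tables."""
    query = """
        SELECT p.name, c.name, t.description
        FROM connection AS k
        JOIN component      AS p ON p.id   = k.parent_id
        JOIN component      AS c ON c.id   = k.child_id
        JOIN connector_type AS t ON t.code = k.connector_code
        ORDER BY p.id, c.id
    """
    return db.execute(query).fetchall()


rows = connection_table(conn)
for row in rows:
    print(row)
```

The point of the design is that each legend is stored once, and the human-readable connection table is derived by a join rather than maintained by hand, so a correction in a legend table propagates automatically.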
We expect that such a database will be very useful in the subsequent step of creating an AI model for the disassembly process. The database can be useful in investigating the use of different models like simple first-principles-based tree models, graph neural networks, or even LLMs.
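As a rough illustration of what the simplest rule-based starting point could look like, the sketch below derives a valid removal order from precedence pairs. It is only an assumption about how connection data might be used: the pair format (blocking part, blocked part) and the part names are hypothetical, and a plain topological sort is a baseline, not one of the learned models mentioned above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical precedence pairs: the first part must be removed
# before the second part becomes accessible.
connections = [
    ("back cover", "battery"),
    ("back cover", "display module"),
    ("display module", "main board"),
]


def disassembly_order(pairs):
    """Return one removal order that respects every precedence pair."""
    sorter = TopologicalSorter()
    for blocker, blocked in pairs:
        # 'blocked' can only be removed after 'blocker'.
        sorter.add(blocked, blocker)
    return list(sorter.static_order())


order = disassembly_order(connections)
print(order)
```

If the pairs contain a cycle, `static_order()` raises `graphlib.CycleError`, which can double as a basic consistency check on the collected data.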
Furthermore, a database will provide flexibility and ease of use for other projects. Possible use cases include: