Project & Thesis Market at ITU 2024 - Master thesis & projects
at MAN Energy Solutions
2450 København, Region Hovedstaden, Denmark
Start Date | Expiry Date | Salary | Posted On | Experience | Skills | Telecommute | Sponsor Visa |
---|---|---|---|---|---|---|---|
Immediate | 27 Nov, 2024 | Not Specified | 29 Aug, 2024 | N/A | Good communication skills | No | No |
Description:
TOPIC 1: SHOP TEST DIGITALIZATION
Keywords: Image processing, document AI, software engineering, human-in-the-loop, diverse data
The first of our three projects aims to fulfil a goal that has existed in the company for many years. We receive performance measurements of vessels taken when they were originally built (shop tests), and our goal is to detect performance deficiencies by comparing operational conditions with these initial measurements. The challenge is that several different parties conduct the performance measurements, and the results are often supplied as PDF documents.
The task is to produce structured information from unstructured PDFs. We know that every shop test must include this information, but the way it is presented differs between suppliers. We have developed an old-school proof-of-concept solution that infers the row and column layout of the underlying Excel table and creates a “signature” from it. This signature can then be compared to a memory bank of template signatures, and if we find a match, we can “overlay” the template and extract the text fields using e.g. OCR technology or PDF metadata. One goal is to complete this “signature” method as a possible baseline, which may involve a human-in-the-loop with a LabelStudio interface.
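To make the “signature” idea concrete, here is a minimal sketch in Python. It is not the company's proof-of-concept. It assumes that words with page coordinates have already been obtained from OCR or PDF metadata, and every name in it (`Word`, `cluster`, `signature`, `best_template`, `extract`, the template dictionary layout, the tolerances) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    x: float  # left edge in page units (assumed to come from OCR or PDF metadata)
    y: float  # top edge in page units


def cluster(values, tol):
    """Group 1-D coordinates lying within `tol` of the previous one; return group means."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= tol:
            groups[-1].append(v)
        else:
            groups.append([v])
    return [sum(g) / len(g) for g in groups]


def normalise(lines):
    """Scale grid-line positions to [0, 1] so the signature ignores page offset and size."""
    lo, span = lines[0], (lines[-1] - lines[0]) or 1.0
    return tuple(round((v - lo) / span, 2) for v in lines)


def signature(words, tol=3.0):
    """Signature = (normalised column positions, normalised row positions)."""
    cols = cluster([w.x for w in words], tol)
    rows = cluster([w.y for w in words], tol)
    return normalise(cols), normalise(rows)


def distance(sig_a, sig_b):
    """Mean absolute offset between grid lines; infinite when the grid shapes differ."""
    total, count = 0.0, 0
    for a, b in zip(sig_a, sig_b):
        if len(a) != len(b):
            return float("inf")
        total += sum(abs(p - q) for p, q in zip(a, b))
        count += len(a)
    return total / count


def best_template(sig, templates, max_dist=0.05):
    """Return the name of the closest template in the memory bank, or None if none is close."""
    name, dist = min(
        ((n, distance(sig, t["signature"])) for n, t in templates.items()),
        key=lambda pair: pair[1],
        default=(None, float("inf")),
    )
    return name if dist <= max_dist else None


def extract(words, template, tol=3.0):
    """Overlay the template: snap each word to its nearest (row, col) cell, then read named cells."""
    cols = cluster([w.x for w in words], tol)
    rows = cluster([w.y for w in words], tol)

    def nearest(lines, v):
        return min(range(len(lines)), key=lambda i: abs(lines[i] - v))

    grid = {(nearest(rows, w.y), nearest(cols, w.x)): w.text for w in words}
    return {field: grid.get(cell) for field, cell in template["fields"].items()}


# Illustrative data only: a two-row, three-column table and a one-entry memory bank.
words = [
    Word("Power", 10, 10), Word("1000", 110, 10), Word("kW", 150, 10),
    Word("Speed", 10, 30), Word("90", 111, 31), Word("rpm", 150, 30),
]
templates = {
    "supplier_a": {
        "signature": ((0.0, 0.72, 1.0), (0.0, 1.0)),
        "fields": {"power": (0, 1), "speed": (1, 1)},
    },
}
match = best_template(signature(words), templates)
fields = extract(words, templates[match]) if match else None
```

When no template is close enough, `best_template` returns `None`, which is a natural point to hand the document to a human annotator and add the resulting layout to the memory bank.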
Once a baseline is developed, you could try to improve upon it with cutting-edge technologies such as PaddleOCR table detection, using deep learning to produce a more flexible method of extracting the information. Once you have built a solution, you can validate it against our thousands of human-annotated measurements to quantify its improvement over the baseline and, perhaps more importantly, to assess qualitative benefits such as flexibility or the omission of a human-in-the-loop.
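One simple way to quantify such a comparison is a per-document field accuracy, computed the same way for the baseline and for the improved method. The sketch below is only an assumption about how the validation could be set up: the function name `field_accuracy`, the dictionary format of the extracted and annotated values, and the relative tolerance are all hypothetical.

```python
import math


def field_accuracy(predicted, annotated, rel_tol=1e-3):
    """Fraction of annotated fields that the extraction reproduced.

    Values that parse as numbers match within `rel_tol`; anything else must
    match exactly after stripping whitespace. Missing fields count as errors.
    """
    if not annotated:
        return 0.0
    hits = 0
    for field, truth in annotated.items():
        guess = predicted.get(field)
        if guess is None:
            continue
        try:
            ok = math.isclose(float(guess), float(truth), rel_tol=rel_tol)
        except (TypeError, ValueError):
            ok = str(guess).strip() == str(truth).strip()
        hits += ok
    return hits / len(annotated)


# Illustrative values only, not real shop test data.
predicted = {"power": "1000", "speed": "91", "unit": "kW"}
annotated = {"power": 1000.0, "speed": 90, "unit": "kW", "torque": 5}
score = field_accuracy(predicted, annotated)
```

Averaging this score over all annotated shop tests gives one number per method, while the qualitative benefits mentioned above still need a separate assessment.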
Succeeding in this task will improve the data quality of existing products, helping to guide vessel owners towards more efficient use of their engines.
REQUIREMENT SUMMARY
Min: N/A | Max: 5.0 year(s)
Information Technology/IT
IT Software - Other
Software Engineering
Graduate
Proficient
1
2450 København, Denmark