Technology Innovation: IT
Research & Development
1. Lumada Data Science Laboratory Contributes to Business and Consolidates Knowledge with Top Data Scientists
Research and development at Hitachi seeks to enhance social, environmental, and economic value. Work is also ongoing on creating global innovation based on Lumada and establishing ecosystems that grow in partnership with stakeholders. The Kyōsō-no-Mori (research and development site to accelerate open ecosystem activities) established at Hitachi’s Central Research Laboratory in April 2019 went on to establish the Lumada Data Science Laboratory in April 2020. This new laboratory brings together around 100 people, providing them with a venue for collaboration where they can each contribute their particular skills and expertise. Along with engineers from Hitachi’s business divisions, these include leading data scientists with a depth of knowledge in the operational technology (OT) that is essential for the commercial deployment of advanced data science and technology.
As well as seeking to create value by having such a wide variety of data scientists work more closely together and engage in open innovation to grapple with complex and high-level challenges facing customers, the laboratory is also working to expand the data science business by sharing the knowledge acquired from particular projects.
The laboratory’s mission also includes the fostering of talent in the fields of artificial intelligence (AI) and analytics. Its role is to lead data science at Hitachi, exerting its influence both inside and outside the company in a variety of ways, including by holding hackathons that are also attended by people from outside Hitachi, presenting papers at leading conferences, gaining high rankings in international competitions, and publishing articles on engineering.
[1] Discussion at laboratory (left) and examples of external activities (right)
2. Migration Automation for Accelerating DX
The digital transformation (DX) of companies and industries is an urgent task, not least because of the impact of the COVID-19 pandemic. The upgrading of legacy IT assets is essential to making good use of data and achieving ongoing business reforms. The migration of systems to more modern programming languages forms one part of this upgrading, and this process needs to speed up in order to accelerate DX.
To this end, Hitachi has developed an automation technique purpose-designed for migration development that is based on a continuous integration (CI) environment. In particular, the technique uses image processing to automate the task of verifying that new system reports match those of the old system. This work, which has had to be done visually, is said to make up between 30 and 50% of testing. A pilot project demonstrated that the technique reduced the amount of testing work by one third. Moreover, being based on a CI environment means that it also provides an environment for ongoing system enhancements after migration.
Combined with Hitachi’s expertise in techniques for replicating the specifications of existing systems, this helps achieve DX-readiness in legacy systems that make good use of existing assets.
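The report comparison step can be illustrated with a minimal sketch. It assumes that both the old and new reports have already been rendered to grayscale bitmaps of the same size, and it uses a plain pixel-by-pixel comparison with a tolerance. The source does not describe the actual image processing Hitachi uses, so the function names `diff_regions` and `reports_match` and the tolerance parameter are hypothetical.

```python
from typing import List, Tuple

# A grayscale image as rows of pixel values in the range 0-255.
Image = List[List[int]]


def diff_regions(old: Image, new: Image, tolerance: int = 0) -> List[Tuple[int, int]]:
    """Return (row, col) positions where the two images differ by more than tolerance."""
    same_shape = len(old) == len(new) and all(
        len(a) == len(b) for a, b in zip(old, new)
    )
    if not same_shape:
        raise ValueError("report images must have identical dimensions")
    return [
        (r, c)
        for r, (row_old, row_new) in enumerate(zip(old, new))
        for c, (p_old, p_new) in enumerate(zip(row_old, row_new))
        if abs(p_old - p_new) > tolerance
    ]


def reports_match(old: Image, new: Image, tolerance: int = 0) -> bool:
    """True when no pixel differs by more than tolerance (assumed pass criterion)."""
    return not diff_regions(old, new, tolerance)


if __name__ == "__main__":
    old_report = [[0, 0], [255, 255]]
    new_report = [[0, 0], [255, 250]]
    print(diff_regions(old_report, new_report))
    print(reports_match(old_report, new_report, tolerance=5))
```

In a CI pipeline, a check of this kind could run after each build, with the list of differing positions attached to the test report so that a person only needs to look at reports that fail.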
3. Automatic Extraction of Information from Unstructured Documents for Operational Efficiency Improvement
The extraction of information from unstructured documents involves first preprocessing the text in the document and then determining the content, including by use of an information search platform for management and searching. The text extracted from the document is then inserted into a database. Unfortunately, the extracted text needs to be reviewed by a human one entry at a time, a task that in the past has required a lot of work.
Hitachi’s new extraction technique combines relationship extraction with table structure analysis. The former analyzes large amounts of text data on the basis of its grammatical structure to identify the relationships between words, while the latter involves automatic recognition of the table’s surface structure. The new technique automates the sequence of steps from extracting the required information from complex unstructured documents on the basis of the text’s structural patterns and table’s surface structure through to its insertion into a database.
While it is currently being deployed in practical applications that primarily involve documents in Japanese, the intention is to expand its scope in the future to cover the English language also.
4. Hitachi Global Data Integration Solution
For global corporations to collect and use Internet of Things (IoT) data from equipment located around the world, they need to be able to acquire and operate IoT connections in each location and have systems for the reliable collection and storage of the rising volumes of equipment data. This has led Hitachi to develop the Hitachi Global Data Integration solution that features platform technologies for global communications control and scalable data collection and storage.
The platform technology for global communications control connects via an application programming interface (API) to the IoT platforms of the different local telecommunications providers to deliver centralized control and management of connections. This enables management of billing, connection status, and the opening and suspension of mobile links to equipment in the different locations. Scalable data collection and storage, meanwhile, uses multiple horizontally scalable machines to handle incoming data, allocating processing to machines with low loads. This includes both the processing of collected data as it comes in and the execution of the APIs that customers use to access data. By doing so, high throughput can be maintained even as data volumes grow.
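The idea of allocating processing to machines with low loads can be sketched as a least-loaded dispatcher. This is only an illustration under stated assumptions: the class name `LeastLoadedDispatcher`, the notion of a numeric cost per request, and the worker names are all hypothetical, and the source does not say how the solution measures load.

```python
import heapq
from typing import Dict, List, Tuple


class LeastLoadedDispatcher:
    """Assigns each incoming job to the worker with the lowest accumulated load."""

    def __init__(self, workers: List[str]) -> None:
        # Min-heap of (current_load, worker_name); ties break on the name.
        self._heap: List[Tuple[float, str]] = [(0.0, w) for w in workers]
        heapq.heapify(self._heap)

    def assign(self, cost: float) -> str:
        """Pick the least-loaded worker and charge it the job's cost."""
        load, worker = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (load + cost, worker))
        return worker

    def add_worker(self, name: str) -> None:
        """Scale out horizontally: a new worker starts with zero load."""
        heapq.heappush(self._heap, (0.0, name))

    def loads(self) -> Dict[str, float]:
        return {worker: load for load, worker in self._heap}


if __name__ == "__main__":
    dispatcher = LeastLoadedDispatcher(["node-a", "node-b"])
    for cost in (5.0, 1.0, 1.0):
        print(dispatcher.assign(cost))
    dispatcher.add_worker("node-c")
    print(dispatcher.assign(1.0))
    print(dispatcher.loads())
```

The sketch never reduces a worker's load. A real system would also subtract the cost when a job completes, or would read load from live metrics instead of tracking it locally.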
In the future, Hitachi intends to contribute to sustainable industrial development around the world by using Hitachi Global Data Integration to accelerate the establishment and operation of customers’ global IoT businesses.
[4] Global communications control platform for centralized control and management of communication links
5. PSI Technique for Extracting Intersections Contained in Encrypted Data
Amid the rapid advance of information processing techniques such as the IoT and AI, the wellspring of organizational competitiveness is shifting toward businesses that make use of data and society itself is moving toward a situation where new added value arises from organizations connecting with one another by means of data. However, in cases where the data shared between multiple organizations contains sensitive information, there is also a need to limit the disclosure of data to only that which is needed for the intended purpose.
Unfortunately, it is difficult for organizations to limit sharing in this way while still keeping that data confidential from one another. To overcome this problem, Hitachi has developed a private set intersection (PSI) technique that utilizes functional encryption and implementation techniques to identify the intersections between information held by different organizations (the common information held by both parties) without decrypting the data.
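To make the idea of a private set intersection concrete, the following is a toy sketch of a different, classic construction: Diffie-Hellman-style commutative blinding. It is not Hitachi's functional-encryption scheme, whose details the source does not give. The modulus is far too small for real use, and all names (`blind`, `reblind`, `intersect`, `make_secret`) are hypothetical.

```python
import hashlib
import math
import secrets
from typing import Iterable, List, Set

# Toy modulus (a Mersenne prime). NOT a secure parameter choice.
P = 2**127 - 1


def _hash_to_group(item: str) -> int:
    """Map an item to an integer in [1, P-1]."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def make_secret() -> int:
    """Random exponent coprime to P-1, so that blinding is a permutation."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind(items: Iterable[str], key: int) -> List[int]:
    """First blinding: H(x)^key mod P, in the same order as the input."""
    return [pow(_hash_to_group(x), key, P) for x in items]


def reblind(values: Iterable[int], key: int) -> List[int]:
    """Second blinding applied by the other party: v^key mod P."""
    return [pow(v, key, P) for v in values]


def intersect(own_items: List[str], own_twice: List[int],
              other_twice: Iterable[int]) -> Set[str]:
    """Items whose doubly blinded value also appears in the other party's set."""
    other = set(other_twice)
    return {x for x, v in zip(own_items, own_twice) if v in other}


if __name__ == "__main__":
    a_items, b_items = ["alice", "bob", "carol"], ["bob", "dave", "carol"]
    ka, kb = make_secret(), make_secret()
    a_twice = reblind(blind(a_items, ka), kb)  # A blinds, B re-blinds
    b_twice = reblind(blind(b_items, kb), ka)  # B blinds, A re-blinds
    print(sorted(intersect(a_items, a_twice, b_twice)))
```

Because exponentiation commutes, both parties arrive at the same doubly blinded value for a shared item, while neither sees the other's items in the clear. Organization A learns only which of its own items are common to both sets.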
Future plans include the addition of new functions such as the extraction of similar as well as intersecting information. The intention is also to engage in collaborative creation (co-creation) with customers to utilize the technique in practical applications.
6. AI Technology for Human Action Recognition with Slight Variation in Motions
Safety and security measures that involve the application of AI to surveillance camera video as a means of identifying people are becoming increasingly common. These techniques are used for purposes such as helping keep factory workers safe by analyzing their actions or recognizing suspicious behavior in public places. However, recognition is difficult when part of the person is occluded or when the variation in motion is too slight to distinguish easily.
The past practice of only using video for AI learning has made it difficult to teach AI to identify small actions. In response, Hitachi has developed a cross-modal AI learning technique that combines camera video with signals from a variety of body-worn sensors. An AI trained in this way on a mix of sensor signals and video can identify even slight variations in motion from camera video alone, without using other sensors, and can achieve highly accurate action recognition even when part of the body is obscured. Compared to action recognition trained using video on its own, this delivers improvements of up to 53% in the accuracy of recognizing actions such as opening a cashbox or using a smartphone.
In the future, Hitachi plans to deploy the technique in video surveillance systems and commercialize it in solutions for suspicious behavior detection or the rapid identification of unwell people.
[6] Cross-modal learning technique using sensor information to augment video recognition
7. Speaking Time Detection Technique for Analyzing Meetings
Prompted by COVID-19, companies are accelerating the transition to telework and making greater use of online videoconferencing. Companies are also starting to adopt speech recognition as a means of taking minutes at such online meetings and for analyzing verbal interactions. Unfortunately, as many speech recognition techniques are intended to work on a single person’s voice only, they need to be augmented with some way of identifying when a different person starts speaking if they are to be used to analyze events such as meetings. Existing technology is also impractical for meeting use because of its inability to detect when more than one person is speaking at the same time.
In response, Hitachi has developed a way to detect when people are speaking that uses a neural network trained using sample audio containing instances of multiple people talking at once. Whereas past techniques were unable to detect periods of time when more than one person is speaking, this new technique can do so with an accuracy of 80% or better. The technique is currently being used in Hitachi’s dictation support service, with demonstration projects underway for analyzing verbal interactions in situations such as call centers.
In the future, Hitachi intends to contribute to improving service quality for corporate customers by leveraging this new technique to expand its dictation support service.
[7] Comparison of speaking time detection performance by old and new techniques
8. Learning-based Image Compression Technique Using Selective Detail Decoding
The amount of data generated by the IoT is predicted to reach around 79.4 zettabytes*1 by 2025*2. In response, Hitachi has been paying particular attention to image data and is working on an image compression technique based on deep learning so that this ever-increasing flow of data can be put to use.
Images may contain a mix of regions, some showing objects such as flowers or other plants where, rather than the detailed structure of each and every leaf, it is things like texture that are important, and others showing things like fine print where the structural detail is important. In response, Hitachi has developed a selective detail decoding technique that optimizes the compression and decompression of different regions of an image on the basis of perceptual image quality (image quality as perceived by human vision). This can improve perceptual image quality even when the available data size is very small.
Hitachi intends to expedite this research and look at utilizing it in IoT solutions that handle large amounts of data.
Note that this is a part of joint research with the Aizawa Laboratory at the Department of Information Communication Engineering, Graduate School of Information Science and Technology of the University of Tokyo. It also utilized the AI bridging cloud infrastructure (ABCI) of the National Institute of Advanced Industrial Science and Technology.
- *1
- The zetta prefix denotes a factor of 10^21.
- *2
- Source: “The Growth in Connected IoT Devices Is Expected to Generate 79.4ZB of Data in 2025, According to a New IDC Forecast”, IDC, https://www.idc.com/getdoc.jsp?containerId=prUS45213219
9. Commercialization of CMOS Annealing in Time of COVID-19
Hitachi is developing a complementary metal-oxide semiconductor (CMOS) annealing technique for quickly finding practical solutions to combinatorial optimization problems. The technique can be used for purposes such as large-scale scheduling or portfolio optimization.
Following Japan’s May 2020 declaration of a state of emergency over COVID-19, Hitachi’s Research & Development Group adopted a shift-based scheme for staggering when staff arrive at work. The scheme, which applies to around 360 research staff at its Central Research Laboratory, uses a high-speed shift scheduling solution in which CMOS annealing is used to generate the shifts. As research work requires access to specialized equipment, a four-shift system was adopted for staff arrival times. The solution was used to specify the detailed requirements and to generate shifts that are efficient, allow research work to proceed smoothly, and minimize the COVID-19 infection risk. The resulting shifts for several hundred staff enable them to avoid the “three Cs” of closed spaces, crowded places, and close-contact settings, taking account of constraints that include the research team they belong to, what they are working on, the progress of experiments, usage of experimental apparatus, desired number of working days, and commuting time.
The high-speed shift scheduling solution using CMOS annealing entered service in the second half of 2020.
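As a rough illustration of how shift generation can be cast as an optimization problem, the sketch below uses ordinary software simulated annealing as a stand-in for CMOS annealing hardware. The energy function, the penalty weight, and the notion of "conflicting" staff pairs (for example, two people who need the same apparatus) are hypothetical simplifications, not the constraints Hitachi actually modeled.

```python
import math
import random
from typing import List, Tuple


def energy(assign: List[int], n_shifts: int,
           conflicts: List[Tuple[int, int]]) -> float:
    """Lower is better: balanced shift sizes and no conflicting pair in one shift."""
    counts = [0] * n_shifts
    for shift in assign:
        counts[shift] += 1
    target = len(assign) / n_shifts
    crowding = sum((c - target) ** 2 for c in counts)
    clashes = sum(1 for i, j in conflicts if assign[i] == assign[j])
    return crowding + 10.0 * clashes  # penalty weight is an arbitrary choice


def anneal(n_staff: int, n_shifts: int, conflicts: List[Tuple[int, int]],
           steps: int = 20000, t_start: float = 5.0, t_end: float = 0.01,
           seed: int = 0) -> Tuple[List[int], float]:
    """Simulated annealing over staff-to-shift assignments."""
    rng = random.Random(seed)
    assign = [rng.randrange(n_shifts) for _ in range(n_staff)]
    current = energy(assign, n_shifts, conflicts)
    best, best_e = assign[:], current
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        i = rng.randrange(n_staff)
        old_shift = assign[i]
        assign[i] = rng.randrange(n_shifts)
        candidate = energy(assign, n_shifts, conflicts)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if candidate <= current or rng.random() < math.exp((current - candidate) / t):
            current = candidate
            if current < best_e:
                best, best_e = assign[:], current
        else:
            assign[i] = old_shift  # reject the move
    return best, best_e


if __name__ == "__main__":
    shifts, score = anneal(n_staff=8, n_shifts=4, conflicts=[(0, 1), (2, 3)])
    print(shifts, score)
```

Annealing hardware typically takes the problem as an Ising-model or equivalent binary quadratic formulation. The integer assignment used here is a simplification chosen for readability.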
10. AI-based IT Operation Technologies in IT Operations Optimization Service for IT Administration
An area of growing investment, AIOps refers to the application of AI and machine learning to system administration in order to support human work and reduce the cost of IT management. Demand is particularly high for the labor-intensive tasks of event handling and reporting. Hitachi has launched an IT operations optimization service for IT administration and has developed two techniques for use in these respective tasks: one automates the suggestion of reference documents that describe candidate actions, and the other automates anomaly detection.
The reference documents relevant to an event have to be chosen on the basis of message text that differs from one event to the next. To deal with this, the technique for automating reference document suggestions uses machine learning to learn a threshold for each reference document from the event notifications issued by the IT system, with the percentage of matching words in the message being used as an indicator of similarity. When a new event occurs, it calculates the similarity between the event and the available reference documents, selecting those reference documents that exceed the threshold. In testing, the new technique achieved a selection accuracy of more than 95%.
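The threshold-based selection described above can be sketched as follows. The sketch assumes each reference document is represented by a set of words, that similarity is the fraction of a message's distinct words found in that set, and that the threshold is chosen to classify a small labelled history as accurately as possible. These details, and the names `match_ratio`, `learn_threshold`, and `suggest`, are assumptions made for illustration. The sketch also treats a similarity equal to the threshold as a match.

```python
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lower-case whitespace tokenization (a deliberately simple assumption)."""
    return set(text.lower().split())


def match_ratio(message: str, reference_words: Set[str]) -> float:
    """Fraction of the message's distinct words that appear in the reference."""
    words = tokenize(message)
    if not words:
        return 0.0
    return len(words & reference_words) / len(words)


def learn_threshold(scored: List[Tuple[float, bool]]) -> float:
    """Pick the observed similarity that classifies the labelled history best.

    Each entry is (similarity, was_this_document_relevant).
    """
    best_t, best_correct = 1.0, -1
    for t in sorted({s for s, _ in scored}):
        correct = sum((s >= t) == relevant for s, relevant in scored)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def suggest(message: str,
            references: Dict[str, Tuple[Set[str], float]]) -> List[str]:
    """Return the names of reference documents whose similarity reaches their threshold."""
    return [
        name
        for name, (words, threshold) in references.items()
        if match_ratio(message, words) >= threshold
    ]


if __name__ == "__main__":
    ref_words = tokenize("disk usage exceeded threshold on volume")
    history = [
        ("disk usage exceeded on volume /var", True),
        ("cpu load high on host", False),
        ("disk volume usage warning", True),
    ]
    scored = [(match_ratio(msg, ref_words), label) for msg, label in history]
    threshold = learn_threshold(scored)
    references = {"disk-runbook": (ref_words, threshold)}
    print(threshold, suggest("disk usage exceeded on volume /data", references))
```

Learning a separate threshold per document matters because a short, generic document may match many messages weakly, while a long, specific one may need a much higher bar before it is worth suggesting.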
The technique for automating anomaly detection automates the identification of performance anomalies in IT systems, a task that in the past was performed manually by experienced IT administrators. It uses machine learning to learn the temporal patterns of performance indicators and automatically detects anomalies that deviate from normal behavior. Computation time has been shortened by including built-in temporal patterns for those patterns frequently exhibited by IT systems. This shortens to a few hours the time it takes to identify anomalies in an IT system made up of several hundred servers, a task that in the past would have taken a week.
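One simple way to picture detection against a learned temporal pattern is a periodic baseline: learn the typical value of a performance indicator at each phase of a cycle (for example, each hour of the day) and flag values that deviate strongly from it. The sketch below assumes such a fixed period and a k-sigma rule. The source does not specify which patterns or models Hitachi's technique uses, so `fit_periodic_baseline` and `detect` are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def fit_periodic_baseline(values: List[float],
                          period: int) -> List[Tuple[float, float]]:
    """Return (mean, std) of the training values for each phase of the period."""
    if period <= 0 or len(values) < period:
        raise ValueError("need at least one full period of training data")
    buckets: List[List[float]] = [[] for _ in range(period)]
    for i, v in enumerate(values):
        buckets[i % period].append(v)
    return [(mean(b), pstdev(b)) for b in buckets]


def detect(values: List[float], baseline: List[Tuple[float, float]],
           k: float = 3.0, min_std: float = 1e-9) -> List[int]:
    """Indices whose value lies more than k standard deviations from its phase mean.

    Assumes values[0] is aligned with phase 0 of the baseline.
    """
    period = len(baseline)
    anomalies = []
    for i, v in enumerate(values):
        mu, sigma = baseline[i % period]
        if abs(v - mu) > k * max(sigma, min_std):
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Toy indicator alternating between a low and a high phase (period of 2).
    training = [10, 50, 10, 50, 11, 49, 9, 51]
    baseline = fit_periodic_baseline(training, period=2)
    print(detect([10, 50, 30, 50], baseline))
```

In the toy data, a value of 30 would look unremarkable against a single global range of 9 to 51, but it is anomalous for the low phase in which it occurs. This is the benefit of modeling the temporal pattern instead of using a fixed limit.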
Plans for the future include automatic execution of selected actions and the shortening of report generation times.
11. 5G Engineering Technology for 5G Platform
With the aim of overcoming societal challenges such as labor shortages, Hitachi develops techniques for managing edge computing to facilitate the adoption and operation of digital solutions that use fifth-generation (5G) telecommunications. It has now developed 5G engineering technologies that enable the rapid deployment and ongoing operation of high-quality wireless communication environments that are rugged enough to withstand use in manufacturing plants and other such sites.
The quality requirements for the transmission of control and image data by equipment in the field differ between customers, with on-site wireless communication conditions also subject to continual change. In response, Hitachi is able to assess wireless communication conditions in real time (on the order of seconds) by using a high-speed radio propagation simulator, which uses a graphics processing unit (GPU) and performs an approximate scattering calculation for diffraction analysis. Another technology enables the rapid deployment of highly reliable communications (with a packet loss rate of 10^-6 and latency of 50 ms) using connectivity techniques that feature routing redundancy and virtual network slicing based on selecting a quality-of-service (QoS) level that suits the customer’s requirements and site wireless communication conditions.
Hitachi intends to continue building its 5G solutions business, using the 5G test environment it has established at its Kyōsō-no-Mori facility to test a variety of applications.
12. Edge Computing for 5G
With the aim of overcoming societal challenges such as labor shortages, Hitachi develops edge computing technology to facilitate the adoption and operation of digital solutions that use 5G telecommunications.
As networks and other equipment installed in the workplace are subject to numerous constraints, a large amount of design work goes into the installation of such systems. Accordingly, Hitachi has recently developed edge orchestration techniques for the remote provision of functions such as the use of video to analyze work progress based on an understanding of customer requirements and the constraints of the workplace OT environment.
Furthermore, Hitachi has developed lightweight AI techniques that downsize deep neural network inference models automatically. Computationally intensive AI is difficult to implement under constraints such as the limited processing performance of devices in the workplace OT environment, but the lightweight AI techniques make it possible to run real-time AI on such performance-constrained devices.
With these techniques, functions that in the past would have taken more than an hour to install can be deployed in less than a minute, without resorting to specialist expertise. Hitachi intends to utilize these techniques in a wide range of applications in the future.
13. Knowledge Utilization Solution for Machine Analysis and Presentation of Maintenance Records
In the maintenance of electric power infrastructure and other assets, archived maintenance records itemizing the work done are used as a source of knowledge in activities such as deciding how to respond to faults. Unfortunately, because these records are made up of unstructured data (also called “dark data”) that is difficult to analyze, making use of them takes a lot of effort and cost, as a human needs to read through them one entry at a time.
In response, Hitachi has developed a knowledge utilization solution in which a system for presenting fault classification details facilitates faster recovery from faults by using the machine analysis of maintenance records to identify the nature of faults and the countermeasures adopted. The information is presented to users in the form of a fault classification tree.
The system uses Hitachi’s smart dictionary platform* to identify the distinctive terminology used in maintenance records, extracting instances of maintenance work, analyzing the relationships between the faults and associated countermeasures, and then showing the results as a fault classification tree diagram. For a fault specified by a search query, which is displayed in grey at the top of the diagram, the fault classification tree shows other faults that have occurred simultaneously in past instances of this fault (blue) and the associated countermeasures (yellow-green), with the tree being structured in a way that indicates their classification. This enables information from a large volume of maintenance records to be reviewed rapidly and all together.
- *
- For more information about the smart dictionary platform, see article 15. “Smart Dictionary Platform for Extracting Meaningful Information from Dark Data.”
[13] Asset maintenance using system for presenting fault classification details
14. E2E Data Management by Extended Lineage for Secure Analysis with Sensitive Data
The prospect of a data-driven society in which new value is created and made available to the public by applying analytical techniques such as AI to a wide variety of data has attracted attention in recent years. For example, the public sector is expanding social welfare services such as healthcare by analyzing patients’ personal information and various other data in order to deliver better health and welfare outcomes.
However, a transparent analysis process is needed for sensitive data such as personal information so as to show how the data is used and what it is used for. Accordingly, it has been necessary in the past for people to put a lot of time into reviewing and verifying analysis processes through trial and error.
In response, Hitachi has developed an end-to-end (E2E) data management technique that can reproduce analysis processes. This uses the extended lineage technique to record the sequence of steps in an analysis, covering feature selection, model building, and evaluation along with data relationships. In the future, Hitachi intends to help create a safe and secure society by applying this technique to the management of analysis processes in fields such as social security and transportation to improve the quality of public services.
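The idea of recording analysis steps together with their data relationships can be sketched with a small lineage log. The `Step` and `Lineage` classes below are hypothetical and much simpler than whatever the extended lineage technique records. They only show how stored input and output relationships let one recover the chain of steps behind a given result.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Step:
    """One recorded analysis step and the data it read and wrote."""
    name: str
    inputs: List[str]
    outputs: List[str]
    params: Dict[str, object] = field(default_factory=dict)


class Lineage:
    def __init__(self) -> None:
        self.steps: List[Step] = []
        self._producer: Dict[str, Step] = {}  # artifact name -> step that wrote it

    def record(self, name: str, inputs: List[str], outputs: List[str],
               **params: object) -> Step:
        """Append a step to the log and remember which artifacts it produced."""
        step = Step(name, list(inputs), list(outputs), dict(params))
        self.steps.append(step)
        for artifact in step.outputs:
            self._producer[artifact] = step
        return step

    def trace(self, artifact: str) -> List[str]:
        """Names of the steps needed to reproduce the artifact, upstream first."""
        ordered: List[str] = []
        seen: Set[int] = set()

        def visit(name: str) -> None:
            step = self._producer.get(name)
            if step is None or id(step) in seen:
                return  # raw input data, or a step already handled
            seen.add(id(step))
            for upstream in step.inputs:
                visit(upstream)
            ordered.append(step.name)

        visit(artifact)
        return ordered


if __name__ == "__main__":
    lineage = Lineage()
    lineage.record("feature_selection", ["raw_data"], ["features"], columns=["age"])
    lineage.record("model_building", ["features"], ["model"], algorithm="logistic")
    lineage.record("evaluation", ["model", "features"], ["report"])
    print(lineage.trace("report"))
```

With such a log, a reviewer can see which data and parameters fed into a result without re-running the analysis by trial and error, which is the kind of transparency needed when handling sensitive data.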
15. Smart Dictionary Platform for Extracting Meaningful Information from Dark Data
There is rising demand for the extraction of commercially useful knowledge from dark data, meaning text and other forms of data that traditionally have not been made use of. Materials informatics (MI) is one field where this demand arises.
MI collects and analyzes large amounts of test data to improve the efficiency of new materials development, using the data to identify materials with properties that satisfy particular requirements or to assess performance under optimal conditions. As these benefits are enhanced when more test data is available, rising demand for gathering such data from patent documents and other text means there is an urgent need to devise highly accurate information extraction methods.
Although ways of using machine learning to automatically extract information from text have been proposed at the academic level, the large amount of work needed to improve their accuracy has posed an obstacle. Hitachi’s smart dictionary platform, on the other hand, is ready for actual business applications. Along with the automatic extraction of information, it is also able to make itself more accurate by using AI to automatically identify and rectify factors that reduce accuracy.
Hitachi is trialing the platform with a wide variety of partners to demonstrate its utility and intends to contribute to new business development in a range of sectors, encompassing maintenance support services as well as MI.