Introduction
This blog post keeps a record of my summer internship journey at Himax Imaging Corp., which lasted for 3 months in the summer of 2024. This is the first internship of my life. I was offered a software / machine learning engineer position, whose responsibilities are
- to develop and improve the computer vision AI models residing within a small chip (or more formally, the AI module product ISM-028) designed by another team of our company, and
- to build a laptop application that demonstrates the capability and applicability of our chip.
Holding two master's degrees, in electrical engineering and computer science respectively, I found this internship a great fit for my background and interests. In particular, my previous projects, a Hand-Writing Robot, an AI-Powered Driver Companion mobile app, and deep-learning-based natural language processing research, along with my experience in rapid software prototyping, equip me with many skills and much knowledge that I can tap into for this internship.
Background
This internship project has 3 main goals:
- to develop advanced AI-powered features provided by ISM-028, the AI chip our company designed,
- to optimize the models for accuracy and speed, and
- to demonstrate the usability and potential of ISM-028.
What is an AI PC?
Endpoint AI vs Edge AI vs Cloud AI
to write
Always-On-Sensing (AOV) and Always-on-Vision (AONV)
to write
Goals
Model Development for More Downstream Computer Vision Tasks
Currently, we already have one vision model capable of tasks such as face detection and face orientation. We have also developed a couple of models for eye tracking, face landmarking/meshing, posture, and gesture detection. These features are very important in our product. However, we would also like to incorporate higher-level features that are closer to real laptop user scenarios. One of them is Emotion Detection. Just as Jarvis knows when Tony Stark is struggling with a task, this feature lets other applications know the user's emotion and do something for the user even before being asked. Imagine soothing background music starting to play when the system detects that you are stressed by a looming deadline. This is the potential our AI module can give to other applications.
Model Optimization
The other important part of this project is to improve model accuracy and inference speed. The latter is especially important because the module has to be energy efficient to preserve the laptop's battery life. One way to achieve this is to compress our models so that the chip spends less time at peak usage, saving power. This makes it possible for our module to provide energy-efficient always-on operation, so the laptop can effortlessly listen to and see the user without quickly draining the battery.
Windows Desktop Application Development
In order to demonstrate the potential of our product to some of our big clients, the laptop manufacturers, we also need to show why and how our AI module will benefit their laptops. So we set out to develop a desktop application that runs on the Windows operating system and provides many useful features powered by our AI module.
In order to do that, we have to add the Keyword Spotting (KWS) feature to our chip, so that the smart assistant living inside your desktop can be summoned just by casting the spell "Hi WiseEye!". Then you can directly tell this secretary to do anything for you, or, for example, it can give you tips on managing stress when it sees that you are stressed.
Framework and Tool Decision For Windows Desktop Application Development
There are a couple of frameworks you can use to develop Windows desktop applications. A common one for modern application development is JavaScript with Electron. We went with this one because we also need a GUI for this app, and the front-end framework (ReactJS), the rich back-end libraries, and the active community of the JavaScript world give us useful third-party packages that we can download and use directly in our application. This improves our development speed and is a perfect fit for our demo application.
Also, if you run into any issues, you can almost always find solutions on the internet thanks to its large and active community. The only consideration is that some OS-level features might not be available in the npm registry; they would need C++ or C# with the Windows API / Windows App SDK, or even the Win32, WinRT, or UWP APIs in rare cases where you are looking for very low-level features such as keyboard backlight control.
There is also an alternative way to develop a native application with JavaScript: React Native. It supports not only native mobile apps for iOS and Android but also native desktop apps; for Windows, it is called React Native for Windows. We did try it, but it still seems buggy, since React Native has only supported Windows since 2023, just one year at the time of writing. Therefore, we switched to ElectronJS.
System Diagram
UI Design
…
Environment Setup for React-Native
Using CLI (Failed)
Follow the steps in React Native Windows CLI · React Native for Windows + macOS (microsoft.github.io)
$ Set-ExecutionPolicy Unrestricted -Scope Process -Force
$ iex (New-Object System.Net.WebClient).DownloadString('https://aka.ms/rnw-vs2022-deps.ps1');
$ npx --yes react-native@latest init wise --version "latest"
$ cd wise
wise\ $ npx --yes react-native-windows-init --overwrite
Failed to launch the application with npx react-native run-windows
The build is successful, but the error message below occurred during deployment:
× Failed to deploy: ERROR: ReflectionTypeLoadException: Unable to load one or more of the requested types. Retrieve the LoaderExceptions property for more information.
Command failed. Re-run the command with --logging for more information.
× Deploying C:/Users/550030/wise/windows/x64/Debug/wise/wise.build.appxrecipe: ERROR: ERROR: ReflectionTypeLoadE...
ERROR: Exceptions from the reflection loader:
ERROR: FileLoadException: Could not load file or assembly 'NuGet.VisualStudio.Contracts, Version=17.10.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)
- Failed to deploy. Package NuGet.VisualStudio.Contracts was missing.
- The package NuGet.VisualStudio.Contracts contains RPC contracts for NuGet’s Visual Studio Service Broker extensibility APIs. These APIs are designed to be usable with async code and are available in this package using Visual Studio’s IServiceBroker.
dotnet list windows\wise.sln package also does not work 😟
The error shows that the SolutionDir environment variable is empty and node_modules\react-native-windows\package.json could not be found. ⇒ Why is it empty? Set the variable as a workaround?
With Visual Studio Code (Failed 😟)
- install extension
- create a file at .vscode\launch.json with the configuration for Debug
- run Debug
With Visual Studio (It works!)
- run npx react-native autolink-windows
- build, deploy and run on local machine or just debug
Environment Setup for Electron with React
Frontend framework: React
Package Manager: Yarn
React framework: None, just use bundler Vite
Tooling: Vite
TypeScript transpiler: esbuild by vite instead of tsc
CSS: tailwind
Unit Testing: vitest (jest or react-testing-library)
Integration Testing: Cypress
Plugins:
Setup for React and Electron with TypeScript
$ yarn add -D vite
$ yarn create vite
√ Select a framework: » Others
√ Select a variant: » create-electron-vite ↗
√ Project template: » React
# this will come with TypeScript, React, and Electron, but it has some issues installing electron using this script.
[1/2] ⢀ esbuild
error C:\Users\550030\wise-electron\node_modules\electron: Command failed.
Exit code: 1
Command: node install.js
Arguments:
Directory: C:\Users\550030\wise-electron\node_modules\electron
Output:
RequestError: unable to verify the first certificate
# to solve this, install electron manually by `yarn add --dev electron`
$ yarn install
$ yarn dev
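For reference, the scaffolded project wires React and Electron together through Vite plugins. The sketch below is roughly what the generated vite.config.ts looks like; the exact plugin options and entry path depend on the template version, so treat the details as assumptions rather than the canonical output.
// vite.config.ts — approximate shape of the config generated by create-electron-vite
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';
import electron from 'vite-plugin-electron';

export default defineConfig({
  plugins: [
    // React fast-refresh and JSX/TSX handling (transpiled by esbuild, not tsc)
    react(),
    // Builds and reloads the Electron main process alongside the renderer
    electron({
      entry: 'electron/main.ts',
    }),
  ],
});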
The application is launched by running yarn dev, and a desktop application shows up as follows.
Setup for Tailwind CSS
Check out my other blog post, Exploring PostCSS and Tailwind CSS: A Modern Approach to CSS.
Tailwind CSS uses postcss and autoprefixer packages, so we have to install them first too.
$ yarn add --dev tailwindcss postcss autoprefixer
# tailwindcss init flags:
#   --esm          Initialize configuration file as ESM
#   --ts           Initialize configuration file as TypeScript
#   -p, --postcss  Initialize a `postcss.config.js` file
#   -f, --full     Include the default values for all options in the generated configuration file
$ npx tailwindcss init -p
Created Tailwind CSS config file: tailwind.config.js
Created PostCSS config file: postcss.config.js
Add paths to the content field of tailwind.config.js:
content: [
  "./index.html",
  "./src/**/*.{js,ts,jsx,tsx}",
],
Add the Tailwind directives to your CSS: add the @tailwind directives for each of Tailwind's layers to your ./src/index.css file.
@tailwind base;
@tailwind components;
@tailwind utilities;
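As a quick sanity check (this snippet is my own addition, not part of the official setup steps), dropping a few Tailwind utility classes into any component makes it easy to confirm the pipeline works:
// Hypothetical smoke test: any component can use Tailwind utility classes directly
export function TailwindCheck() {
  // If this renders as large, bold, underlined text, the Tailwind pipeline works
  return <h1 className="text-3xl font-bold underline">Hello WiseEye!</h1>;
}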
Use react-chatbot-kit
First, create a subfolder, Chatbot, under src/components/ to keep all chatbot-related files in a single place. This folder contains:
Chatbot/
  config.js
  MessageParser.jsx
  ActionProvider.jsx
  Chatbot.tsx
  Chatbot.css
The first three files are suggested by the library.
For config.js,
import { createChatBotMessage } from 'react-chatbot-kit';

const config = {
  initialMessages: [createChatBotMessage(`Hello world`)],
};

export default config;
For ActionProvider.jsx, we have to copy and paste the following content:
import React from 'react';

const ActionProvider = ({ createChatBotMessage, setState, children }) => {
  return (
    <div>
      {React.Children.map(children, (child) => {
        return React.cloneElement(child, {
          actions: {},
        });
      })}
    </div>
  );
};

export default ActionProvider;
For MessageParser.jsx,
import React from 'react';

const MessageParser = ({ children, actions }) => {
  const parse = (message) => {
    console.log(message);
  };

  return (
    <div>
      {React.Children.map(children, (child) => {
        return React.cloneElement(child, {
          parse: parse,
          actions: {},
        });
      })}
    </div>
  );
};

export default MessageParser;
Chatbot.tsx is a file that I specifically created so that other files can import the MyChatbot React component and use it elsewhere. Inside this file, as suggested by the documentation of react-chatbot-kit, copy and paste the following:
import Chatbot from 'react-chatbot-kit';
import 'react-chatbot-kit/build/main.css';

import config from './config.js';
import MessageParser from './MessageParser.jsx';
import ActionProvider from './ActionProvider.jsx';

export const MyChatbot = () => {
  // ...
  return (
    <div>
      <Chatbot
        config={config}
        messageParser={MessageParser}
        actionProvider={ActionProvider}
      />
    </div>
  );
  // ...
};
Then, we can fire up the application with yarn dev, which will run electron. The rendered page in Electron is shown below.
Chatbot Integration
I reuse the chatbot I implemented with the LangChain.js framework in TypeScript, so here I only need to integrate the chatbot process with the React UI components. After about one to two hours of work to solve some async issues, the job was done, mostly in MessageParser.jsx and ActionProvider.jsx; a rough sketch of the change follows.
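Concretely, the change boils down to adding an async action in ActionProvider.jsx and calling it from parse() in MessageParser.jsx. In the sketch below, runChatbot() is a hypothetical wrapper around the LangChain.js chain; the setState/createChatBotMessage pattern is the one suggested by react-chatbot-kit.
// Inside ActionProvider.jsx (sketch): an async action that awaits the chatbot
const handleUserMessage = async (message) => {
  // runChatbot() is a hypothetical helper that invokes the LangChain.js chain
  const reply = await runChatbot(message);
  const botMessage = createChatBotMessage(reply);
  setState((prev) => ({
    ...prev,
    messages: [...prev.messages, botMessage],
  }));
};

// Inside MessageParser.jsx (sketch): forward the user's message to the action
const parse = (message) => {
  actions.handleUserMessage(message);
};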
Also, I got to fix an existing bug in the original chatbot process I developed, where it had an amnesia problem, by removing the redundant question-rewriting model.
The First Natural Command - Set Brightness
Next, because this is the first time I have used ElectronJS to develop a Windows desktop app, I was still not sure whether or how it can do system control. So I went for implementing an OS-level feature, setting the screen brightness, to make sure the framework is able to do that.
The first thing to do is to implement a message parsing function that knows when a message is a command the assistant has to forward to the OS; I simply use a regex for this at first. The second thing is to execute the command, which is where things got tricky. A simplified sketch of the parsing part is shown below.
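As an illustration, the parsing can be as simple as the following; the regex, the command shape, and the file name commandParser.ts are my own simplified stand-ins, not the exact code used in the demo.
// commandParser.ts — hypothetical sketch of regex-based OS-command detection
export interface OsCommand {
  type: 'set-brightness';
  level: number; // 0-100
}

// Matches phrases like "set brightness to 70" or "set the brightness to 70%"
const BRIGHTNESS_RE = /set\s+(?:the\s+)?brightness\s+to\s+(\d{1,3})\s*%?/i;

export function parseOsCommand(message: string): OsCommand | null {
  const match = BRIGHTNESS_RE.exec(message);
  if (match) {
    return { type: 'set-brightness', level: Math.min(100, parseInt(match[1], 10)) };
  }
  // Not an OS command; let the chatbot handle the message instead
  return null;
}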
After 3 to 4 hours of working and debugging, it fortunately works. The key concept here is that ElectronJS has two kinds of processes:
- a Renderer process, which runs within a browser and cannot access the OS directly (like a virtual environment or a virtual world)
- a Main process, which runs on the operating system, just like a back-end; it has direct access to the OS and controls both the browser that the renderer process runs within and the application lifetime
In order to enable our React app to communicate with the OS, we have to use the Inter-Process Communication (IPC) feature provided by ElectronJS in its preload.ts/js, along with main.ts/js and renderer.ts/js; a minimal sketch of this wiring follows. The result is shown below.
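Below is a minimal sketch of that IPC wiring. The channel name, the exposed wiseApi object, and the PowerShell/WMI call are my own illustrative choices (WmiSetBrightness only works on displays whose driver exposes WMI brightness control), so treat this as a sketch of the pattern rather than the exact demo code.
// main.ts (main process) — has OS access; handles the request from the renderer
import { ipcMain } from 'electron';
import { execFile } from 'node:child_process';

ipcMain.handle('set-brightness', (_event, level: number) => {
  const clamped = Math.max(0, Math.min(100, level));
  // One common approach on Windows laptops: WMI via PowerShell
  const script =
    '(Get-WmiObject -Namespace root/WMI -Class WmiMonitorBrightnessMethods)' +
    `.WmiSetBrightness(1, ${clamped})`;
  return new Promise<void>((resolve, reject) => {
    execFile('powershell.exe', ['-NoProfile', '-Command', script], (err) =>
      err ? reject(err) : resolve()
    );
  });
});

// preload.ts — bridges the two worlds by exposing a narrow, safe API
import { contextBridge, ipcRenderer } from 'electron';

contextBridge.exposeInMainWorld('wiseApi', {
  setBrightness: (level: number) => ipcRenderer.invoke('set-brightness', level),
});

// renderer (React side) — invoked when the parser recognizes the command
// await window.wiseApi.setBrightness(70);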
C++ Event Listener
- FT4222 Library
- Check EPII_ISP_TOOL
To Investigate
- C/C++ addons for nodejs
- rollup
- node api
- node.h
- NAN
- C# / .NET as Backend
- .NET Core with TypeScript