@PeterCxy That's basically the main idea of the VLIW architecture. Unlike a traditional superscalar CPU, which decides at run time which instructions to issue together, a VLIW CPU executes whatever the instruction stream explicitly groups together. The order of execution is controlled by the instructions, not the CPU, which means the responsibility for optimizing a program for instruction-level parallelism is transferred back to the programmer or the compiler. Intel's Itanium started out from a similar idea (EPIC).
Unfortunately, both VLIW hardware design and VLIW compilers are still open problems.
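To make "the compiler does the scheduling" a bit more concrete, here is a rough Python sketch of greedy bundle packing. It's only an illustration, not how any real VLIW compiler works: `Op` and `pack_bundles` are names I made up, every op is assumed to take one cycle, every register is assumed to be written only once, and memory dependencies are ignored.

```python
from typing import NamedTuple


class Op(NamedTuple):
    """A toy instruction: writes `dst`, reads `srcs` (hypothetical format)."""
    dst: str
    srcs: tuple


def pack_bundles(ops, width):
    """Greedily pack ops into bundles of at most `width` slots.

    Assumptions (for illustration only): unit latency, each register is
    written once, no memory dependencies. An op may not share a bundle
    with, or precede, the op that produces one of its sources.
    """
    bundles = []      # bundles[i] is the list of ops issued together in cycle i
    written_in = {}   # register name -> index of the bundle that produces it

    for op in ops:
        # Earliest legal cycle: one after the latest producer of any source.
        earliest = 0
        for src in op.srcs:
            if src in written_in:
                earliest = max(earliest, written_in[src] + 1)

        # Find the first bundle at or after `earliest` with a free slot.
        i = earliest
        while i < len(bundles) and len(bundles[i]) >= width:
            i += 1
        while len(bundles) <= i:
            bundles.append([])

        bundles[i].append(op)
        written_in[op.dst] = i

    return bundles


# a and b are independent; c needs both; d needs c.
program = [
    Op("a", ("x", "y")),
    Op("b", ("z", "w")),
    Op("c", ("a", "b")),
    Op("d", ("c", "x")),
]

for cycle, bundle in enumerate(pack_bundles(program, width=2)):
    print(cycle, [op.dst for op in bundle])
```

With two slots per bundle, `a` and `b` share the first bundle, while `c` and `d` each end up alone, so half the slots are wasted. On real hardware those empty slots become NOPs, and filling them is exactly the job that gets pushed onto the compiler.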
https://en.wikipedia.org/wiki/Very_long_instruction_word
https://en.wikipedia.org/wiki/Explicitly_parallel_instruction_computing