TWI845274B - A cell quality prediction system and method, as well as an expert knowledge parameterization method. - Google Patents

A cell quality prediction system and method, as well as an expert knowledge parameterization method

Info

Publication number
TWI845274B
Authority
TW
Taiwan
Prior art keywords
data
cell
model
prediction
combination
Prior art date
Application number
TW112115189A
Other languages
Chinese (zh)
Other versions
TW202443594A (en)
Inventor
陳振耀
王榮華
陳明哲
黃仁傑
林映任
賴易鍾
Original Assignee
國立臺灣海洋大學
Application filed by 國立臺灣海洋大學
Priority to TW112115189A
Application granted
Publication of TWI845274B
Publication of TW202443594A

Landscapes

  • Image Analysis (AREA)

Abstract

The present invention discloses a cell quality prediction system and method, as well as an automatic anomaly data detection method. A certain amount of retrospective data is preprocessed and labeled by experts to produce a training data set and structured data. The training data set is processed by big data analysis or machine learning algorithms to obtain a first probability distribution model. The structured data contains expert knowledge and is used to tune the first probability distribution model into a second probability distribution model, which is used to predict cell quality. The invention is characterized in that the first probability distribution model can be indirectly tuned by the user's expert knowledge. During the prediction stage, the user can select which parameters to include in or exclude from the structured data.

Description

Cell quality prediction system and method, and automatic abnormal-data detection method

The present invention discloses a cell quality prediction system and method, together with an automatic abnormal-data detection method. The detection method screens a given amount of retrospective data, and the flagged entries are confirmed and corrected with the user to ensure data quality. Experts then annotate the data to record the state the cells reached under a specific set of parameters. The annotated data are processed by big data analysis or a learning algorithm to obtain a probability distribution model that predicts the probability of future events. The invention can be applied to breeding and to assisted reproduction.

According to World Population Review statistics, Taiwan's fertility rate fell from 3.87% to 0.18% between 1955 and 2022, and the country faces a national security crisis of fewer newborns, an aging population, and declining future competitiveness. Statistics also put the prevalence of infertility in Taiwan above 15%, meaning that one in every eight couples is affected. Fortunately, Taiwan is a world leader in reproductive medicine, with a success rate ranked second in the world and first in Asia. The main treatment for infertility today is assisted conception by in vitro fertilization (IVF), but in conventional treatment an embryologist judges blastocyst quality by eye through a microscope. Field of view, brightness, viewing angle, personal experience, and the embryologist's physical condition at the time can all introduce errors of judgment. If AI techniques were used to learn from the experience of different embryologists, their grading criteria could be combined into a scientific, objective evaluation model that reduces subjective error.

For blastocyst quality assessment, an embryo grading system introduced in 1999 (the Gardner system) is still in use today. It grades embryos by the degree of blastocyst expansion and hatching status, the texture of the inner cell mass (ICM), and the density of the trophectoderm (TE) cells. Many approaches in Taiwan and abroad use AI to assist blastocyst grading, but most still rely on a single image and do not go beyond morphokinetic methods.

Republic of China (Taiwan) Patent Publication No. I781408 discloses an artificial-intelligence cell detection method and system based on hyperspectral data analysis. Its purpose is to address the problem that, in IVF treatment, physicians judge embryo quality subjectively, which makes the success rate of infertility treatment hard to improve and leaves it prone to misjudgment arising from differing subjective opinions. It uses a hyperspectral imager to capture several images continuously during embryo growth and applies artificial intelligence to analyze the changes in chemical composition that the embryo undergoes between two specific time points, in order to assess cell quality or identify cells. Although its application scenario and purpose resemble those of the present invention, it analyzes only hyperspectral images, and its claims do not teach that the system can predict cell implantation rate. The claims of the present invention, by contrast, disclose a method that combines medical record data, expert knowledge, and structured data to predict cell implantation rate and cell quality.

Chinese Patent Publication No. CN109544512A discloses a multimodal pregnancy outcome prediction device intended to raise the success rate of IVF. The developing embryo is focused and photographed manually to obtain three images: a blastocyst image, an inner cell mass image, and a trophectoderm image. With the pregnancy rate result as the label, a neural network is trained to produce a model that predicts the pregnancy rate. The method requires manually refocusing on the embryo and taking three photographs at different focal distances, so differences in subjective judgment between users introduce errors when the photographs are taken, which makes the results hard to verify. The method of the present invention instead applies automatic abnormal-data detection, including data cleaning, standardization, and normalization, to medical record data that include images, and then predicts automatically. Users need not manually adjust data acquisition at prediction time, which removes this source of subjective judgment.

Both prior-art references above aim to improve the success rate of IVF, but they differ substantially from the present invention in several respects: both disclose predicting blastocyst quality or pregnancy rate from images only and do not disclose prediction that combines medical record data with expert knowledge; neither discloses a way for users to choose which parameters to include in the analysis; neither discloses a method of automatically fitting medical record data added later; and neither discloses a method in which users set multiple parameter intervals for multiple parameters in the data set and thereby tune the probability distribution model.

The retrospective data used by the present invention are recorded according to each institution's rules and each recorder's habits, so both the recording format and the recorded content vary. To make the invention easy to apply at institutions in Taiwan and abroad, it uses an automatic abnormal-data detection method that organizes the retrospective data according to the user's settings and filters out abnormal or outlying entries for the user to confirm or modify.

Most of the retrospective data used by the present invention are sensitive personal data involving trade secrets or personal privacy. To reduce the risk of data leakage, data transfers between the user and the software vendor should be kept as infrequent as possible. Accordingly, a feature of the present invention is that, once the big data analysis or learning algorithm has been trained, an expert knowledge parameterization method and an automatic abnormal-data detection method can be used to ensure data quality and to tune the resulting probability distribution model so that it fits newly added data. The retrospective data no longer need to be handed to the software vendor periodically for processing, which reduces both the user's workload and the risk associated with data exchange.

In practice, besides automatically evaluating cell implantation rate and cell quality from medical record data and the probability distribution model, the method lets users dynamically choose, based on their experience, which parameters to include in or exclude from the analysis. Big data analysis methods also allow the prediction output to show how much each input parameter influenced the result, so that predictions can be interpreted through post-hoc explanations, which provides the interpretability of the method.

The effects of the present invention are as follows. First, it discloses a cell quality prediction system and method that, depending on the characteristics of the training data set and the annotation method, can be applied to breeding and to biotechnology and medicine, using retrospective data to predict the probability of future events. Second, it discloses an expert knowledge parameterization method. Big data analysis and learning algorithms depend heavily on the annotation quality of the training data set, yet in the fields where the invention applies, events are mostly evaluated by experience or by eye (expert knowledge). Such judgments cannot be fed directly into an algorithm, so a way of expressing them in written form is needed before they can serve as training data and raise annotation quality. Third, the probability distribution model can be tuned manually or automatically to fit newly added data. This lowers the leakage risk of frequently sending data to the software vendor for retraining and, most notably, allows the model to be customized by user, country or region, and usage scenario. Fourth, it discloses an automatic abnormal-data detection method that filters out abnormal or outlying data for the user to confirm and correct, improving data quality.

Examples of the training data set and its annotation: if the retrospective data are maternal medical record parameters or blastocyst parameters from reproductive medicine, each record is annotated with whether pregnancy occurred; if the retrospective data come from chromosome testing, each record is annotated with whether a specific disease may be present and the name of that disease.

The method of tuning the structured data with expert knowledge is as follows: given at least one probability distribution model, the user draws on expert knowledge to set multiple parameter intervals for multiple parameters in the training data set, or to exclude specific parameters from the training data set, and the probability distribution model is tuned accordingly.

In one feasible embodiment (see Figure 1), the present invention provides a cell quality prediction system, method, and automatic abnormal-data detection method (a), comprising a cell carrier (b), a sensing module (c), medical record data (d), a storage unit (e), a computing unit (f), and a user interface (g).

The cell carrier (b) may be a test tube or a culture dish.

The sensing module (c) is a lens module capable of autofocus, whose focusing may use optical, digital, or hybrid zoom, for example an electronic microscope.

The medical record data (d) comprise the parameters obtained during IVF treatment and the images of embryo growth captured by the sensing module (c). The parameters include, but are not limited to: clinical data, age, BMI, estradiol (E2), preimplantation chromosome screening (PGS/PGT-A) data, preimplantation genetic diagnosis (PGD/PGT-M) data, karyotype analysis data, amniotic fluid array comparative genomic hybridization (aCGH) data, non-invasive fetal chromosome testing data (NIFTY/NIPT/NIPS/niPGS/niPGT-A), chromosome number, luteinizing hormone (LH), follicle-stimulating hormone (FSH), sperm count, sperm motility, sperm morphology, motile sperm density, total motile sperm count, semen liquefaction time, pH, anti-Müllerian hormone (AMH), endometrial thickness, follicle count, progesterone (P4), implantation method, blastocyst quality grade (Gardner grading system), beta human chorionic gonadotropin, single images, and time-lapse images from a time-lapse incubator (TLI).

These parameters can be added and removed dynamically as the user requires, and are automatically subjected to data cleaning (data cleansing), standardization, and normalization.

Data cleaning at least includes filtering out duplicate data, irrelevant data, extreme values, noise, missing data, or structurally invalid data, and then deleting them, imputing values, replacing them with the mean of the preceding and following values, or filling them with approximate values.
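Two of the cleaning operations named above, removing duplicate records and replacing a missing value with the mean of the preceding and following values, can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `clean_series` and `drop_duplicates`, the use of `None` to mark a missing value, and the sample values are all assumptions.

```python
from statistics import mean

def clean_series(values):
    """Fill each missing entry (None) with the mean of its nearest valid neighbours."""
    cleaned = list(values)
    for i, v in enumerate(cleaned):
        if v is None:
            prev_vals = [x for x in values[:i] if x is not None]
            next_vals = [x for x in values[i + 1:] if x is not None]
            # Nearest valid value before and after the gap (either may be absent).
            neighbours = prev_vals[-1:] + next_vals[:1]
            cleaned[i] = mean(neighbours) if neighbours else None
    return cleaned

def drop_duplicates(records):
    """Remove exact duplicate records (dicts) while keeping the original order."""
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

# Hypothetical hormone readings with one missing measurement.
readings = clean_series([1.0, None, 3.0])
```

Neighbours are looked up in the original series rather than in the partially filled one, so a run of consecutive gaps is filled from the same pair of real measurements.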

Data standardization methods at least include min-max scaling (Max-Min), the standard score (Z-score), maximum absolute value scaling (MaxAbs), the robust scaler, the mean, the standard deviation, and algorithms that map the data to decimals between 0 and 1.
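Three of the scaling methods listed above have simple closed forms. The sketch below writes them out in plain Python so the arithmetic is visible; the function names and the choice of the population standard deviation for the Z-score are assumptions made for illustration.

```python
from statistics import mean, pstdev

def min_max(xs):
    """Map values linearly onto [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]

def z_score(xs):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:
        return [0.0 for _ in xs]
    return [(x - mu) / sigma for x in xs]

def max_abs(xs):
    """Divide by the largest absolute value, so results lie in [-1, 1]."""
    m = max(abs(x) for x in xs)
    return [x / m if m else 0.0 for x in xs]

# Hypothetical ages scaled onto [0, 1].
scaled_ages = min_max([20, 30, 40])
```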

Data normalization methods at least include label encoding, one-hot encoding, and algorithms that give the data a mean of 0 and a standard deviation of 1. The processed data are provided to information engineers or data scientists for big data analysis or for training learning algorithms, in order to obtain an optimized probability distribution model.
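Label encoding and one-hot encoding both turn a categorical parameter into numbers that an algorithm can consume. A minimal sketch follows; the function names and the category values `"fresh"` and `"frozen"` (standing in for values of the implantation method parameter) are hypothetical.

```python
def label_encode(values):
    """Map each distinct category to an integer, in order of first appearance."""
    mapping = {}
    for v in values:
        mapping.setdefault(v, len(mapping))
    return [mapping[v] for v in values], mapping

def one_hot(values):
    """Represent each category as a 0/1 vector with a single 1."""
    codes, mapping = label_encode(values)
    width = len(mapping)
    return [[1 if i == c else 0 for i in range(width)] for c in codes]

codes, mapping = label_encode(["fresh", "frozen", "fresh"])
vectors = one_hot(["fresh", "frozen", "fresh"])
```

Label encoding implies an ordering between categories that may not exist, which is the usual reason to prefer one-hot vectors for unordered parameters.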

The storage unit (e) at least includes a hard disk drive (HDD), a solid-state drive (SSD), and memory (RAM).

The computing unit (f) at least includes a central processing unit (CPU), a graphics processing unit (GPU), and an accelerated processing unit (APU).

The storage unit (e) and the computing unit (f) may be hosted on any one or a combination of a personal computer, an industrial computer, a computer cluster, a cloud computing platform, a laptop, a mobile phone, or an edge computing device.

The big data analysis and learning algorithms referred to in the present invention include backpropagation, supervised learning, semi-supervised learning, ensemble learning, active learning, reinforcement learning, generative models, discriminative models, long short-term memory (LSTM), object detection, instance segmentation, and diffusion models.

The generative models mentioned above at least include the Gaussian mixture model, maximum likelihood estimation, the hidden Markov model, and the naive Bayes classifier.

The discriminative models mentioned above at least include logistic regression, linear regression, the support vector machine, the decision tree, and eXtreme Gradient Boosting.

The automatic abnormal-data detection method uses the data cleaning and data normalization methods described above to screen automatically for abnormal or outlying data. Combined with the following methods, it also fits the raw data automatically, reduces data dimensionality, speeds up computation, lowers hardware requirements, and improves the interpretability of the system. These methods at least include genetic algorithms, clustering, image processing, edge detection, the chi-square test, and the expectation-maximization (EM) algorithm.
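The screening of outlying values for later user confirmation can be illustrated with a simple rule. The sketch below flags values outside the interquartile-range fences. This IQR rule is a stand-in chosen for brevity and is not one of the methods the description lists; the function name `flag_outliers`, the factor `k`, and the sample ages are assumptions.

```python
from statistics import quantiles

def flag_outliers(xs, k=1.5):
    """Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(xs, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [i for i, x in enumerate(xs) if x < lo or x > hi]

# Hypothetical patient ages; the last entry looks like a data-entry error.
ages = [28, 31, 33, 35, 36, 38, 120]
suspect = flag_outliers(ages)  # indices to present to the user for confirmation
```

Returning indices rather than deleting the values matches the workflow described above, in which flagged entries are shown to the user instead of being discarded silently.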

The present invention provides a cell quality prediction system, method, and automatic abnormal-data detection method comprising at least the following steps:

(I) The user provides retrospective data. The user may be a practitioner in the relevant field or a person holding a relevant license.

(Ⅱ) An information engineer or data scientist devises the automatic abnormal-data detection method according to the characteristics of the retrospective data.

(Ⅲ) Automatic abnormal-data detection is run on the retrospective data, and experts in the relevant field confirm, correct, and annotate the data, producing structured data and a training data set for big data analysis or for training the learning algorithm.

(Ⅲ) The training data set is fed to the big data analysis or learning algorithm described above for training, yielding a first probability distribution model and a first prediction model, which are used together to compute a first prediction result.

(Ⅳ) The first probability distribution model from step (Ⅲ) is tuned to produce a second probability distribution model and a second prediction model. The tuning uses the structured data generated by the automatic abnormal-data detection method in step (Ⅲ), or parameter intervals that the user, drawing on expert knowledge, sets for multiple parameters in the training data set, or the user's exclusion of specific parameters from the training data set. Tuning the model within the user-defined parameter intervals lets abstract expert knowledge enter the algorithm as parameters.

(V) The retrospective data from step (I) are fed to the second prediction model and second probability distribution model from step (Ⅳ) to obtain a second prediction result, which at least includes the cell implantation rate, the cell quality, and the influence of each input parameter on the prediction. The parameters fed to the second prediction model and second probability distribution model can be selected by the user in step (Ⅳ).

The structured data in step (Ⅳ) comprise mutually corresponding text data, numerical data, image data, time-series data, embryo images, clustering results, and multiple parameter intervals for multiple parameters, defined automatically by the system or by the user.

The prediction result in step (V) has multiple predicted output values, including: predicted cell implantation rate, cell growth time points, object detection, instance segmentation, object count, blastocyst zona pellucida thickness, number of trophectoderm cells, proportion of total area occupied by trophectoderm cells, proportion of total area occupied by the inner cell mass, cell size, uniformity of cell division, degree of cell fragmentation, and cell quality assessment, where cell quality is assessed with the Gardner blastocyst grading system.

The predicted output values of multiple single images can be concatenated to form time-series data.
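Concatenating per-image outputs into time-series data amounts to grouping the values by output name and ordering them by capture time. The sketch below assumes each image yields a pair of capture time and a dictionary of named outputs; that record layout, the function name `to_time_series`, and the sample values are hypothetical.

```python
def to_time_series(frames):
    """Group per-image outputs by name, ordered by capture time.

    frames: iterable of (hours_since_fertilization, {output_name: value}).
    Returns {output_name: [(hours, value), ...]}.
    """
    series = {}
    for hours, outputs in sorted(frames, key=lambda f: f[0]):
        for name, value in outputs.items():
            series.setdefault(name, []).append((hours, value))
    return series

# Hypothetical outputs from two images, supplied out of order.
frames = [
    (48, {"cell_count": 4, "fragmentation": 0.05}),
    (24, {"cell_count": 2, "fragmentation": 0.02}),
]
series = to_time_series(frames)
```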

The cell implantation rate is computed by analyzing the following results obtained from big data analysis or learning algorithms: cell growth time points, object detection, instance segmentation, object count, blastocyst zona pellucida thickness, number of trophectoderm cells, proportion of total area occupied by trophectoderm cells, proportion of total area occupied by the inner cell mass, cell size, uniformity of cell division, and degree of cell fragmentation.

New data added later are fitted automatically as follows: when a new record is produced, it is automatically passed through the abnormal-data detection of step (Ⅲ) and then handed to the user for confirmation, correction, and annotation. The result is used to modify the structured data of step (Ⅳ), which in turn tunes the second probability distribution model.
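One way to picture fitting new records on site, without retraining from the full history, is an incrementally updated distribution. The sketch below keeps a running Gaussian mean and variance using Welford's online algorithm. The description does not state which update rule the invention uses, so this is an illustrative assumption, as are the class name `RunningGaussian` and the sample values.

```python
class RunningGaussian:
    """Mean and population variance updated one confirmed record at a time."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new value into the estimate (Welford's algorithm)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / self.n if self.n else 0.0

# Hypothetical AMH values arriving over time, each already confirmed by the user.
model = RunningGaussian()
for value in [2.0, 4.0, 6.0]:
    model.update(value)
```

Because only three numbers are stored, the earlier records never need to leave the user's site again, which is in line with the data-leakage concern raised in the description.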

Examples of multiple parameter intervals for multiple parameters, defined automatically by the system or by the user: if the training data set concerns aquaculture, the intervals may be suitable ranges of water temperature, pH, or dissolved oxygen; if it concerns reproductive medicine, the intervals may be suitable ranges of age, BMI, or AMH. A parameter may have more than one range, for example age defined as the two ranges [18-24] and [30-40].
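Parameter intervals of this kind can be represented as a mapping from parameter name to a list of closed ranges. The sketch below checks whether a record satisfies every constrained parameter. The age ranges are the ones given in the text; the function names, the dictionary layout, and the treatment of a missing parameter as a failed check are assumptions.

```python
def in_any_interval(value, intervals):
    """True if value lies in at least one closed interval (lo, hi)."""
    return any(lo <= value <= hi for lo, hi in intervals)

def matches_expert_ranges(record, ranges):
    """True if every constrained parameter is present and inside an allowed interval."""
    return all(
        name in record and in_any_interval(record[name], spans)
        for name, spans in ranges.items()
    )

# Age defined as two ranges, as in the example above.
expert_ranges = {"age": [(18, 24), (30, 40)]}

inside = matches_expert_ranges({"age": 22, "bmi": 21.5}, expert_ranges)
outside = matches_expert_ranges({"age": 27}, expert_ranges)
```

Parameters that the expert leaves unconstrained, such as `bmi` here, are simply ignored by the check.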

Examples of tuning the first probability distribution model and first prediction model into the second probability distribution model and second prediction model: if the probability distribution model is a hidden Markov model, the parameter intervals are used to tune the probability of each state in the Markov chain; if it is a Bayes classifier, the parameter intervals are used to tune its maximum likelihood estimate.
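For the Bayes classifier case, one possible reading of tuning the maximum likelihood estimate with parameter intervals is to re-estimate a Gaussian's mean and variance from only those samples that fall inside the expert's intervals. The description does not spell out the mechanism, so the sketch below is an interpretation under that assumption; the function names and sample ages are hypothetical.

```python
from statistics import mean, pvariance

def gaussian_mle(xs):
    """Maximum likelihood estimate (mean, population variance) of a Gaussian."""
    return mean(xs), pvariance(xs)

def tuned_mle(xs, intervals):
    """Re-estimate from samples inside the expert intervals.

    Falls back to the untuned estimate if no sample lies in any interval.
    """
    kept = [x for x in xs if any(lo <= x <= hi for lo, hi in intervals)]
    return gaussian_mle(kept) if kept else gaussian_mle(xs)

ages = [20, 22, 35, 45]                              # hypothetical training ages
first = gaussian_mle(ages)                           # plays the role of the first model
second = tuned_mle(ages, [(18, 24), (30, 40)])       # after expert tuning
```

In this reading the first and second estimates differ only through the samples the expert's ranges admit, so two users with different ranges obtain different second models from the same first model.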

The present invention generates multiple probability distribution models, one for each of the big data analysis and learning algorithms mentioned above, and then presents the prediction results either individually or combined through ensemble learning.

The ensemble learning methods at least include one or a combination of the following ways of deciding the final result from the outputs of multiple algorithms: averaging, taking the maximum, taking the minimum, or binarizing the results with a given threshold and voting.
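The four combination rules named above can be written as one small function. In this sketch each model contributes a probability between 0 and 1; the function name `ensemble`, the default threshold of 0.5, and the strict-majority rule for voting are assumptions.

```python
from statistics import mean

def ensemble(probs, method="mean", threshold=0.5):
    """Combine per-model probabilities into one result.

    "mean", "max" and "min" return a probability; "vote" binarizes each
    probability at the threshold and returns True on a strict majority.
    """
    if method == "mean":
        return mean(probs)
    if method == "max":
        return max(probs)
    if method == "min":
        return min(probs)
    if method == "vote":
        votes = [p >= threshold for p in probs]
        return sum(votes) > len(votes) / 2
    raise ValueError(f"unknown method: {method}")

# Hypothetical implantation probabilities from three models.
probs = [0.2, 0.6, 0.7]
averaged = ensemble(probs)
majority = ensemble(probs, method="vote")
```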

The cell quality prediction system has a user interface through which the user operates the system and views its execution results. The system can host and run multiple learning algorithms, and can automatically adjust the size of the probability distribution model it runs according to the learning algorithm or the specifications of the hardware, so as to optimize computing performance.

The user interface may be a text-only interface or a graphical interface.

The operations provided by the user interface include: switching between the front end and back end of the interface, user login, user logout, editing user information, entering text, selecting files to upload, selecting or excluding parameters to be fed to the learning algorithm, and using expert knowledge to set multiple parameter intervals for multiple parameters in the training data set.

The system execution results shown in the user interface include one or a combination of: the training data set, the predictions of the learning algorithm, visualized data, audio data, explanatory text, or numerical data.

a: Cell quality prediction system, method, and automatic abnormal-data detection method

b: Cell carrier

c: Sensing module

d: Raw data

e: Storage unit

f: Computing unit

g: User interface

g1: Group of checkbox components

g2: Group of text label and input field components

g3: Image display area

g4: Comparison of results from multiple prediction methods

g5: Information display area

g6: Detailed information on multiple prediction results

g7: Button component

g8: Button component

g9: Text label component

g10: Slider component

h1: Data points in the training data set

h2: Coverage of the original probability distribution model

h3: Newly added data points

h4: Probability distribution model after tuning

i1: Value range that user A's expert knowledge considers appropriate

i2: PDF before tuning

i3: Value range that user B's expert knowledge considers appropriate

i4: PDF after tuning

Figure 1 is a system block diagram of the cell quality prediction system, method, and automatic abnormal-data detection method of the present invention.

Figure 2 is a schematic diagram of the front-end interface of the cell quality prediction system, method, and automatic abnormal-data detection method of the present invention.

Figure 3 is an enlarged view of the information interface of the cell quality prediction system, method, and automatic abnormal-data detection method of the present invention.

Figure 4 is an enlarged view of the prediction result interface of the cell quality prediction system, method, and automatic abnormal-data detection method of the present invention.

Figure 5 is a schematic diagram of the back-end interface of the cell quality prediction system, method, and automatic abnormal-data detection method of the present invention.

Cell quality prediction system and method, and automatic abnormal data detection method

Referring to Figure 1, the user uses the sensing module (c) to photograph embryos cultured in the cell carrier (b), either manually with an electron microscope or as time-lapse photography captured at fixed intervals by a time-lapse incubator.

After the embryo has developed to the blastocyst stage (usually 120 hours), the user operates the user interface (g), and the system automatically combines the captured images with the medical record data into retrospective data (the original data). The data first undergo automatic abnormal data detection; the detection result is presented to the user for confirmation or modification and is stored in a storage unit (e). The computing unit (f) then performs the computation and produces the prediction result.
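The automatic abnormal data detection step can be sketched as a range check whose output is handed to the user for confirmation. This is a minimal sketch under stated assumptions: the parameter names and the numeric ranges below are hypothetical, and the patent leaves the actual intervals to the expert.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical plausible ranges; the real intervals are set by the expert.
PLAUSIBLE_RANGES: Dict[str, Tuple[float, float]] = {
    "age": (18.0, 55.0),
    "bmi": (12.0, 50.0),
    "endometrial_thickness_mm": (3.0, 20.0),
}


def flag_abnormal(
    record: Dict[str, Optional[float]],
    ranges: Dict[str, Tuple[float, float]] = PLAUSIBLE_RANGES,
) -> List[str]:
    """Return the names of parameters that are missing or outside their range.

    The returned list is what would be shown to the user for confirmation
    or correction before the record is stored.
    """
    flagged = []
    for name, (low, high) in ranges.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            flagged.append(name)
    return flagged


# A BMI of 220 is probably a typing error, and the thickness is missing.
to_review = flag_abnormal({"age": 34, "bmi": 220.0})
```

Nothing is deleted automatically in this sketch; flagged fields are only collected, which matches the description's requirement that the user confirm or modify the detection result.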

If the captured image is a single image, at least the embryo information in that image can be obtained, including the number and area ratio of the trophectoderm (TE) cells and the inner cell mass (ICM), and the thickness of the zona pellucida.

If the captured images form a continuous sequence (video), such as time-lapse images, then in addition to the above information, at least the time points at which changes occur during embryo growth can be obtained, including but not limited to: the times of cleavage to two through eight cells (t2 to t8), the time from fertilization to morula (tM), and the times from fertilization to early blastocyst, blastocyst, and expanded blastocyst (tSB, tB, tEB).
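One way to read the time points t2 to t8 off a time-lapse sequence is sketched below. It assumes an upstream detector has already produced a cell count for each frame, and it takes tn to be the first frame time at which at least n cells are seen; both the input format and that convention are assumptions, not details given in the description.

```python
from typing import Dict, List, Optional, Tuple


def cleavage_times(
    frames: List[Tuple[float, int]], max_cells: int = 8
) -> Dict[str, Optional[float]]:
    """Map 't2'..'t<max_cells>' to the first time each cell count is reached.

    frames holds (hours since fertilization, detected cell count) pairs in
    any order. A time point is None when the count is never reached.
    """
    ordered = sorted(frames)  # sort by time
    times: Dict[str, Optional[float]] = {}
    for n in range(2, max_cells + 1):
        times[f"t{n}"] = next((t for t, count in ordered if count >= n), None)
    return times


# Hypothetical per-frame counts from a time-lapse incubator.
frames = [(0.0, 1), (26.5, 2), (37.0, 3), (38.5, 4), (52.0, 8)]
time_points = cleavage_times(frames)
```

Because the count jumps from four to eight between frames here, t5 through t8 all resolve to the same frame time; a shorter capture interval would separate them.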

The above results are presented in the user interface (g) for the user's reference, and the user can change them by text or voice input. Once the user confirms that the information is correct, the original information and the user's changes are automatically saved back to the storage unit (e) for later retrospective use.

In Figures 2 to 5, the parts drawn in dashed lines are not claimed as part of the design and the remaining parts are claimed, except that the position, size, and distribution relationships between the appearance and its environment are not claimed.

Referring to Figure 2, (g) denotes the user interface; (g1) is a group of checkboxes among the form components; (g2) is a combination of text labels and input fields among the form components; (g3) is the image display area; (g4) is the comparison of the results of multiple prediction methods; (g5) is the information display area; (g6) is the detailed information on multiple prediction results; and (g7) and (g8) are button components.

Referring to Figure 2, the component (g2) is not claimed as part of the design. It is shown only to illustrate this embodiment and may be replaced by any component that provides at least text display and text input.

Referring to Figures 2 and 3, the graphic (g4) is not claimed as part of the design. It only illustrates, for this embodiment, the visual display of multiple distinct pieces of information, and may take other forms including but not limited to bar charts, pie charts, line charts, and area charts.

Referring to Figure 2, the component (g7) is not claimed as part of the design. It is shown only to illustrate this embodiment and may be replaced by any component that supports a click event.

Referring to Figures 2 to 5, the component (g8) is not claimed as part of the design. It is shown only to illustrate this embodiment and may be replaced by any component that supports a click event.

The user interface (g) of the cell quality prediction system and method and the automatic abnormal data detection method of this case operates as follows. Referring to Figure 2, the user enters a medical record number in the input field (g2). If a record already exists for that number, the system automatically fills the remaining fields of (g2) with the other parameters. The user then uses the checkboxes (g1) to select the parameters to include in or exclude from the calculation, and clicks the button (g7) to run the calculation.

If no record exists for the entered medical record number, the user can manually fill in the other fields of (g2) and click the button (g7) to run the calculation.

After the button (g7) is clicked, the calculation results are presented in the image display area (g3), the comparison of the results of multiple prediction methods (g4), the information display area (g5), and the detailed information on multiple prediction results (g6).

The detailed information on multiple prediction results (g6) includes at least the prediction results of the multiple probability distribution models (g4), the interval into which the value of each parameter in the input fields (g2) falls, and the possible reasons for the prediction result.
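The checkbox step (g1) amounts to filtering the record before it reaches the model. The following is a minimal sketch, assuming the record is a mapping from parameter name to value and the checkbox state is a mapping from parameter name to a boolean; the parameter names are illustrative only.

```python
from typing import Dict


def select_parameters(
    record: Dict[str, float], checked: Dict[str, bool]
) -> Dict[str, float]:
    """Keep only the parameters whose checkbox is ticked.

    Parameters that are unticked, or that have no checkbox at all,
    are excluded from the calculation.
    """
    return {name: value for name, value in record.items() if checked.get(name, False)}


# Hypothetical record and checkbox state.
record = {"age": 34, "bmi": 22.1, "amh": 2.8}
checked = {"age": True, "bmi": False, "amh": True}
model_input = select_parameters(record, checked)
```

Treating an absent checkbox as "excluded" is a conservative design choice for this sketch; the description does not say how the system handles that case.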

Referring to Figure 3, clicking the image display area (g3) in Figure 2 enlarges the image display area (g3) and the information display area (g5) on their own, changing the appearance of the user interface (g). This function lets the user view the multimedia information more clearly.

Referring to Figure 4, clicking the comparison of the results of multiple prediction methods (g4) in Figure 2 enlarges the comparison (g4) and the detailed information on multiple prediction results (g6) on their own, changing the appearance of the user interface (g). This function lets the user view the prediction results, the intervals into which the input parameters fall, and the possible reasons for the prediction results more clearly.

Referring to Figure 5, clicking the button component (g8) in the user interface (g) switches the appearance of the user interface (g) between Figure 2 and Figure 5. This function lets the user switch to the back end to set the intervals of each parameter.

Referring to Figure 5, (g) is the user interface; (g8) is a button component; (g9) is a text label among the form components; and (g10) is a slider component in the form.

Referring to Figure 5, the component (g9) is not claimed as part of the design. It is shown only to illustrate this embodiment and may be replaced by any component that can display text.

Referring to Figure 5, the component (g10) is not claimed as part of the design. It is shown only to illustrate this embodiment and may be replaced by any component that can input or set a numerical value.

The back-end page of the user interface (g) of the cell quality prediction system and method and the automatic abnormal data detection method of this case operates as follows. Referring to Figure 5, the text label component (g9) displays the name of each parameter, and the user can use the slider component (g10) to divide the parameter's values into a plurality of intervals. Each interval is defined by the user, for example by declaring that a parameter within a given interval represents a good, moderate, or poor state. After completing the settings, the user clicks the button component (g8) to switch to the front-end page and then clicks the button component (g7) to run the calculation. The system reflects the user-adjusted intervals in the final prediction result and presents it in the user interface (g).
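The slider-defined intervals can be represented as a sorted list of cut points plus one label per resulting interval. The sketch below assumes that representation; the cut points, the labels, and the rule that a value equal to a cut point belongs to the upper interval are all illustrative choices rather than details from the description.

```python
from bisect import bisect_right
from typing import Sequence


def label_for(value: float, cut_points: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label of the interval that contains value.

    cut_points are the sorted slider positions; they split the axis into
    len(cut_points) + 1 intervals, so labels needs exactly that many entries.
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return labels[bisect_right(cut_points, value)]


# Hypothetical expert setting for one parameter: below 7 is poor,
# 7 up to 14 is good, and 14 or above is moderate.
cuts = [7.0, 14.0]
names = ["poor", "good", "moderate"]
state = label_for(9.5, cuts, names)
```

Because the labels are supplied by the user, two experts can attach different meanings to the same numeric range without any change to the lookup itself.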

Example of probability distribution model modulation

Referring to Figure 6, which shows how the probability distribution model is modulated: (h1) denotes the data points in the training data set, (h2) the coverage of the original probability distribution model, (h3) the newly added data points, and (h4) the modulated probability distribution model. The modulation is implemented by changing the coverage of the probability distribution model so that it covers the newly added data points.
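The change in coverage can be illustrated in one dimension. This sketch assumes the model is a single Gaussian, that its coverage is the band of mean plus or minus two standard deviations, and that modulation is done by refitting on the enlarged data set; the description states only that the coverage changes to include the new points, so all three are assumptions, and the data values are made up.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def coverage(points: Sequence[float], k: float = 2.0) -> Tuple[float, float]:
    """Return the band mean - k*std .. mean + k*std fitted to points."""
    centre = mean(points)
    spread = pstdev(points)
    return (centre - k * spread, centre + k * spread)


h1 = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0]  # training data points
h3 = [9.0, 10.0]                     # newly added data points

h2 = coverage(h1)       # coverage of the original model
h4 = coverage(h1 + h3)  # coverage after modulation

new_points_covered_before = all(h2[0] <= p <= h2[1] for p in h3)
new_points_covered_after = all(h4[0] <= p <= h4[1] for p in h3)
```

With these values the original band ends below 7, so the new points fall outside it, while the refitted band extends past 10 and contains them.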

Referring to Figure 7, a probability distribution function (PDF) for a single parameter serves as a simple example. The horizontal axis x is the parameter value and the vertical axis y is the probability of the event; (i1) is the value range that user A's expert knowledge considers appropriate, (i2) is the PDF before modulation, (i3) is the value range that user B's expert knowledge considers appropriate, and (i4) is the PDF after modulation. Suppose user A's expert knowledge holds that the suitable parameter range is [1, 10] and the PDF trained on that basis is (i2), while another user B's expert knowledge holds that the suitable range is [6, 20]. It is then only necessary to change the parameter range in the system to [6, 20]; the system adjusts the mean and standard deviation to fit, automatically modulating the pre-adjustment PDF (i2) into (i4), and thereby customizes the probability distribution model for different users.
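The adjustment of mean and standard deviation in this example can be sketched as follows. The description does not state how a range maps to the two parameters, so the mapping used here is an assumption: the PDF is treated as a normal density, the mean is the midpoint of the range, and the range spans the mean plus or minus two standard deviations.

```python
from statistics import NormalDist


def pdf_from_range(low: float, high: float, k: float = 2.0) -> NormalDist:
    """Build a normal distribution whose mean +/- k*std band equals [low, high]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return NormalDist(mu=(low + high) / 2, sigma=(high - low) / (2 * k))


i2 = pdf_from_range(1, 10)   # user A's range: mean 5.5, std 2.25
i4 = pdf_from_range(6, 20)   # user B's range: mean 13.0, std 3.5

# The same parameter value is scored differently by the two models.
density_a = i2.pdf(13.0)
density_b = i4.pdf(13.0)
```

Under this mapping, changing the range from [1, 10] to [6, 20] shifts the curve to the right and widens it, which is the qualitative change from (i2) to (i4) that the example describes.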

The scope of protection of this creation is not limited to what the embodiments disclose; it includes all substitutions and modifications that do not depart from this creation, and is covered by the following claims.

a: Cell quality prediction system and method, and automatic abnormal data detection method

b: Cell carrier

c: Sensing module

d: Original data

e: Storage unit

f: Computing unit

g: User interface

Claims (26)

1. A cell quality prediction method, in which automatic abnormal data detection is performed on the medical record data of at least one patient to produce a training data set and structured data; the training data set is processed by a learning algorithm to produce a first probability distribution model and a first prediction model; a first prediction result is calculated using the first prediction model and the first probability distribution model; the structured data, which carries expert knowledge, is used to modulate them, yielding a second probability distribution model and a second prediction model; and a second prediction result is calculated using the second prediction model and the second probability distribution model; the invention being characterized in that a user with expert knowledge can select, from the structured data, the parameters to include in or exclude from the computation, and can modulate the structured data so as to indirectly modulate the first probability distribution model and the first prediction model; wherein modulating the structured data consists of the user, according to his or her expert knowledge, setting a plurality of parameter intervals for each of a plurality of parameters in the training data set or excluding specific parameters from the training data set, and wherein each interval can be defined, and its range adjusted, by the user.

2. The cell quality prediction method of claim 1, wherein the medical record data includes at least one or a combination of: age, BMI, female hormone, luteinizing hormone, follicle-stimulating hormone, semen liquefaction time, pH value, anti-Müllerian hormone, endometrial thickness, number of follicles, progesterone, implantation method, blastocyst quality grade, beta human chorionic gonadotropin, sperm motility, total motile sperm count, and semen liquefaction time.

3. The cell quality prediction method of claim 1, wherein the learning algorithm includes at least one or a combination of: backpropagation, big data analysis, image classification, object detection, generative models, discriminative models, long short-term memory, reinforcement learning, generative pre-trained transformer models, and ensemble learning.

4. The cell quality prediction method of claim 3, wherein the generative model includes at least one or a combination of: Gaussian mixture models, maximum likelihood estimation, hidden Markov models, naive Bayes classifiers, and algorithms that can use retrospective data to calculate the probability of future events.

5. The cell quality prediction method of claim 3, wherein the discriminative model includes at least one or a combination of: logistic regression, linear regression, support vector machines, artificial neural networks, decision trees, and extreme gradient boosting.

6. The cell quality prediction method of claim 3, wherein the ensemble learning includes at least one or a combination of the following methods of voting on a final result from the results produced by a plurality of algorithms: averaging, taking the maximum, taking the minimum, or binarizing the results with a given threshold.

7. The cell quality prediction method of claim 1, wherein the user includes at least one or a combination of: practitioners in the relevant field and persons holding relevant licenses.

8. The cell quality prediction method of claim 1, wherein the structured data includes at least one or a combination of the following, having corresponding relationships: text data, numerical data, image data, time-series data, embryo images, and data clustering results.

9. The cell quality prediction method of claim 1, wherein the second prediction result includes at least one or a combination of: the cell implantation rate, the cell quality, and the influence of each input parameter on the prediction result.

10. The cell quality prediction method of claim 9, wherein the cell implantation rate is obtained by using the learning algorithm to make predictions on the medical record data, producing a plurality of prediction output values, and analyzing them.

11. The cell quality prediction method of claim 10, wherein the prediction output values include at least one or a combination of the results for: cell growth time points, object detection, instance segmentation, object counting, blastocyst zona pellucida thickness, number of trophectoderm cells, ratio of total trophectoderm cell area, ratio of total inner cell mass area, cell size, uniformity of cell division, and degree of cell fragmentation.

12. The cell quality prediction method of claim 11, wherein the prediction output values of a plurality of single images can be concatenated into time-series data.

13. The cell quality prediction method of claim 11, wherein the cell growth time points are obtained by analyzing the medical record data using image processing or machine learning, and include at least one or a combination of the time points of: cleavage into two cells, three cells, four cells, five cells, six cells, seven cells, and eight cells, fertilization to morula, fertilization to early blastocyst, fertilization to blastocyst, and fertilization to expanded blastocyst.

14. An expert knowledge parameterization method, the method having at least one training data set, at least one expert, and at least one given probability distribution model, characterized in that the expert, according to his or her expert knowledge, sets a plurality of parameter intervals for each of a plurality of parameters in the training data set, or excludes specific parameters from the training data set, to produce structured data, and the structured data is used to modulate the probability distribution model, so that abstract expert knowledge is incorporated into the algorithm as parameters; wherein the expert includes at least one or a combination of: information engineers, data scientists, practitioners in a field related to the data set, and professionals holding relevant licenses.

15. An automatic abnormal data detection method, the method requiring at least one training data set and at least one expert in a field related to the training data set, characterized in that abnormal or outlier data in the training data set are automatically screened out by data cleaning, data standardization, and data normalization methods and are compiled for the expert to confirm or modify; wherein the automatic screening is based on a plurality of parameter intervals that the user, according to his or her expert knowledge, sets for each of a plurality of parameters in the training data set.

16. The automatic abnormal data detection method of claim 15, wherein the pre-processing method includes one or a combination of: data cleaning, data standardization, data normalization, genetic algorithms, data clustering, image processing, edge detection, the chi-square test, and the EM algorithm.

17. The automatic abnormal data detection method of claim 15, wherein the data cleaning method includes at least screening out duplicate data, irrelevant data, extreme values, noise, missing data, or structurally erroneous data, and handling them by one or a combination of: deletion, imputation, averaging the preceding and following values, and filling with an approximate value.

18. The automatic abnormal data detection method of claim 15, wherein the data standardization method includes at least one or a combination of: the min-max algorithm, the standard score, maximum-absolute-value standardization, Robust Scaler, the mean, the standard deviation, and algorithms that bring the data to decimals between 0 and 1.

19. The automatic abnormal data detection method of claim 15, wherein the data normalization method includes at least one or a combination of: Label Encoding, One-hot Encoding, and algorithms that give the data a mean of 0 and a standard deviation of 1.

20. A cell quality prediction system, the system including at least: a hardware device on which it runs; a user interface through which a user operates the system and in which the system execution results are presented; medical record data of a patient; an algorithm; a training data set; and structured data; characterized in that the training data set is processed by a learning algorithm on the hardware device to produce a first probability distribution model and a first prediction model; a first prediction result is calculated using the first prediction model and the first probability distribution model; the user, according to his or her expert knowledge, sets a plurality of parameter intervals for each of a plurality of parameters in the training data set and selects from the structured data the parameters to include in or exclude from the computation, thereby modulating the structured data; the structured data is then used to modulate the first probability distribution model, yielding a second probability distribution model and a second prediction model; and a second prediction result is calculated using the second prediction model and the second probability distribution model.

21. The cell quality prediction system of claim 20, wherein the hardware device includes at least one or a combination of: a storage unit, a computing unit, a personal computer, an industrial computer, a computer cluster, cloud computing, a laptop computer, a mobile phone, or an edge computing device.

22. The cell quality prediction system of claim 20, wherein the storage unit includes one or a combination of: a hard disk drive, a solid-state drive, and memory.

23. The cell quality prediction system of claim 20, wherein the computing unit includes one or a combination of: a central processing unit, a graphics processing unit, and an accelerated processing unit.

24. The cell quality prediction system of claim 20, wherein the user interface includes at least one or a combination of: a text-only interface and a graphical interface.

25. The cell quality prediction system of claim 20, wherein the operation includes at least one or a combination of: switching between the front end and back end of the user interface, user login, user logout, modification of user information, text input, selecting files to upload, selecting or excluding parameters input to the machine learning algorithm, and using expert knowledge to set a plurality of parameter intervals for each of a plurality of parameters in the training data set.

26. The cell quality prediction system of claim 20, wherein the system execution results include at least one or a combination of: the original data supplied to the learning algorithm, the prediction results of the learning algorithm, visualized data, audio data, explanatory text, or numerical data.
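The ways of combining results named in the ensemble learning claim (averaging, maximum, minimum, and threshold binarization followed by a vote) can be sketched as below. This is an illustrative reading, not the claimed implementation: the function name, the majority-vote rule, and the default threshold of 0.5 are assumptions.

```python
from statistics import mean
from typing import Sequence


def combine(
    probabilities: Sequence[float], method: str = "average", threshold: float = 0.5
) -> float:
    """Combine the outputs of several models into one final result.

    'vote' binarizes each output at threshold and returns 1.0 when a strict
    majority of the models is at or above it, otherwise 0.0.
    """
    if not probabilities:
        raise ValueError("at least one model output is required")
    if method == "average":
        return mean(probabilities)
    if method == "max":
        return max(probabilities)
    if method == "min":
        return min(probabilities)
    if method == "vote":
        positive = sum(1 for p in probabilities if p >= threshold)
        return 1.0 if positive * 2 > len(probabilities) else 0.0
    raise ValueError(f"unknown method: {method}")


# Hypothetical outputs of three models for one embryo.
outputs = [0.2, 0.6, 0.7]
final_vote = combine(outputs, "vote")
```

Here two of the three outputs reach the threshold, so the vote is positive even though the minimum is only 0.2, which shows how the choice of combination rule can change the final result.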
TW112115189A 2023-04-24 2023-04-24 A cell quality prediction system and method, as well as an expert knowledge parameterization method. TWI845274B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW112115189A TWI845274B (en) 2023-04-24 2023-04-24 A cell quality prediction system and method, as well as an expert knowledge parameterization method.


Publications (2)

Publication Number Publication Date
TWI845274B true TWI845274B (en) 2024-06-11
TW202443594A TW202443594A (en) 2024-11-01

Family

ID=92541705


Country Status (1)

Country Link
TW (1) TWI845274B (en)

