RingChains Graph-based Summarizer and Enhanced Large Language Models for Summarizing Long Documents
Large language models (LLMs) have shaped real-world applications since ChatGPT appeared. Although powerful LLMs produce high-quality summaries, summarizing long documents remains challenging for them. First, LLMs must process a large number of unimportant input tokens, and they perform billions of operations per input token because of their complicated architectures and large model sizes. Second, most standard LLMs have a limited context window. If the number of context tokens increases by a factor of n, both the required computational resources and the running time scale as n² in a Transformer architecture or as n√n in a sparse Transformer architecture. Third, using LLMs typically requires either an internet connection or high-performance local hardware. Fourth, LLMs need vast amounts of training data, and they still cannot entirely avoid hallucinations. Some real-world documents, such as classified files, cannot be used for training, cannot be uploaded to the internet, and cannot tolerate hallucinations. Moreover, approximately one billion people worldwide own computers but lack internet access. These individuals have already been left behind in the internet revolution. We must ensure they are not left behind again in the AI revolution. This dissertation proposes the RingChains topology graph-based summarizer, which can run on any computer. It offers fast execution, unlimited input tokens, high-quality summaries, no training process, and no generated hallucinations. RingChains processed 500 government reports from the ZeroSCROLLS dataset in 22.06 seconds, whereas GPT-4o took 5,749.04 seconds, and both achieved almost the same level of accuracy. RingChains is particularly suited to domains such as classified documents, and it can help people who have computers but no internet connection participate in the AI revolution.
This dissertation also presents RingChains_LLMs, a system that significantly reduces computational resources, running time, and cost, and that handles the limited input length of LLMs with small context windows while avoiding the expensive process of modifying the architecture or adding training steps to expand the context window. By deploying the RingChains_LLMs system, users can obtain high-quality summaries comparable to those of powerful LLMs while greatly reducing both cost and running time. The open-source user applications for both RingChains and RingChains_LLMs are available on my GitHub (tamdoancong/application).
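The scaling claims and the reported timings in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical and are not part of RingChains or RingChains_LLMs, and it models just the growth rates stated above (n² for a Transformer, n√n for a sparse Transformer), not the measured cost of any real model.

```python
import math


def dense_cost_factor(n: float) -> float:
    """Cost growth when the context length grows by a factor of n,
    assuming the quadratic (n^2) scaling stated in the abstract."""
    return n ** 2


def sparse_cost_factor(n: float) -> float:
    """Cost growth under the n * sqrt(n) scaling stated in the abstract
    for a sparse Transformer."""
    return n * math.sqrt(n)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Growth in cost for a few context-length multipliers.
    for n in (2, 4, 16):
        print(f"n={n}: dense x{dense_cost_factor(n):.0f}, "
              f"sparse x{sparse_cost_factor(n):.1f}")

    # Timings reported in the abstract for 500 government reports.
    ratio = speedup(baseline_seconds=5749.04, candidate_seconds=22.06)
    print(f"Reported running-time ratio: about {ratio:.0f}x")
```

With the reported timings, the running-time ratio comes out to roughly 260 to 1, which is the scale of the gap the abstract describes.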

