musn129: A US Stock News Summary Service Using BERT and SQuAD

https://news.hada.io/topic?id=7647

musn129 (머선129): a US stock news summary service

https://musn129.com/

A service that collects US stock news and summarizes why a stock rose or fell.

    Fully serverless architecture (AWS Lambda + CloudFront Lambda@Edge)
    Crawls news about US stocks, then infers an answer with a question-answering model fine-tuned on SQuAD (e.g., "Why Nvidia stock goes up?")
        Uses a lightweight model small enough to run on Lambda (a distilled BERT-family model)
    Every Lambda that produces data runs within the AWS Free Tier; the only additional costs are S3 and the Route 53 domain (currently about $2 per month)
        All data is stored in S3; no separate database is used
    SSR (Next.js) + CloudFront Lambda@Edge (us-east-1)
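
The crawl-then-ask flow in the list above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the service's actual code: the function names (`build_question`, `best_answer`, `load_qa`), the `min_score` cutoff, and the checkpoint `distilbert-base-cased-distilled-squad` are all chosen for illustration; the post only says that a distilled BERT-family model is used. The QA model is passed in as a callable so the selection logic can be tried without downloading a model.

```python
from typing import Callable, Dict, Iterable, Optional

# A QA callable takes (question, context) and returns a dict with at least
# "answer" (str) and "score" (float), mirroring the output of the
# Hugging Face question-answering pipeline.
QAFn = Callable[[str, str], Dict]


def build_question(company: str, went_up: bool) -> str:
    """Build a question in the style of the post's example."""
    direction = "up" if went_up else "down"
    return f"Why {company} stock goes {direction}?"


def best_answer(question: str, articles: Iterable[str], qa: QAFn,
                min_score: float = 0.1) -> Optional[Dict]:
    """Ask the same question of every article and keep the most confident answer.

    min_score is a hypothetical cutoff for discarding weak answers.
    """
    best = None
    for text in articles:
        if not text.strip():
            continue  # skip empty crawl results
        result = qa(question, text)
        if result["score"] < min_score:
            continue
        if best is None or result["score"] > best["score"]:
            best = {"answer": result["answer"], "score": result["score"]}
    return best


def load_qa(model_name: str = "distilbert-base-cased-distilled-squad") -> QAFn:
    """Wrap a Hugging Face pipeline as a QAFn (requires `transformers`).

    The checkpoint name is an assumption, not the one the service uses.
    """
    # Imported lazily so the rest of the module works without transformers.
    from transformers import pipeline

    qa_pipeline = pipeline("question-answering", model=model_name)
    return lambda question, context: qa_pipeline(question=question, context=context)
```

In a Lambda handler, one might call `best_answer(build_question("Nvidia", True), crawled_articles, load_qa())` and write the resulting dict to S3 as JSON; `crawled_articles` here is a hypothetical list of article texts.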

------------------------------------

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
https://arxiv.org/abs/1810.04805

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).

SQuAD : The Stanford Question Answering Dataset
https://rajpurkar.github.io/SQuAD-explorer/

Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.
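
Because every answer is a span of the passage, a SQuAD-style model typically scores each token as a possible answer start and as a possible answer end, and the prediction is the highest-scoring valid (start, end) pair. The sketch below illustrates that decoding step in plain Python. The logit values, the `max_answer_len` limit, and the convention that index 0 stands for "no answer" (commonly the [CLS] position in SQuAD v2.0-style models) are assumptions for illustration, not details from the post.

```python
from typing import List, Optional, Tuple


def best_span(start_logits: List[float], end_logits: List[float],
              max_answer_len: int = 15) -> Optional[Tuple[int, int]]:
    """Return the (start, end) token indices of the best answer span.

    Index 0 is treated as the "no answer" position; None is returned when
    the no-answer score is at least as high as the best real span.
    """
    if len(start_logits) != len(end_logits):
        raise ValueError("start and end logits must have the same length")

    null_score = start_logits[0] + end_logits[0]
    best_score = float("-inf")
    best = None
    for i in range(1, len(start_logits)):
        # The end may not precede the start, and the span length is capped.
        last = min(i + max_answer_len, len(end_logits))
        for j in range(i, last):
            score = start_logits[i] + end_logits[j]
            if score > best_score:
                best_score, best = score, (i, j)
    if best is None or null_score >= best_score:
        return None
    return best


# Made-up logits for a six-token passage.
tokens = ["[CLS]", "shares", "rose", "after", "strong", "earnings"]
start = [0.1, 0.2, 0.1, 0.3, 2.5, 0.4]
end = [0.1, 0.1, 0.3, 0.2, 0.6, 2.8]
span = best_span(start, end)
if span is not None:
    print(" ".join(tokens[span[0]:span[1] + 1]))  # prints "strong earnings"
```

Returning `None` when the no-answer score wins corresponds to the "question might be unanswerable" case in the dataset description.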

