career-horoscope

1111

03/04/2026

03/03/2026

// =============================
// SemanticChunkUtil.java
// =============================
import java.util.*;
import java.util.regex.*;

public class SemanticChunkUtil { }

// =============================
// FileChunkWriter.java
// =============================
import java.nio.file.*;
import java.io.IOException;
import java.util.List;

public class FileChunkWriter { }

// =============================
// FileChunkReader.java
// =============================
import java.nio.file.*;
import java.io.IOException;
import java.util.*;

public class FileChunkReader { Read more…


02/18/2026

OCR JSON
↓
Flatten CSV
↓
Rule-based CSV (block_id, page_number, element_ids, merged_text)
↓
Token-safe chunk CSV ← the step being built now
↓
LLM processing
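The token-safe chunking step in this pipeline can be sketched roughly as follows. This is only an illustration, not the post's actual code: the class name TokenSafeChunkerSketch, the method names, and the "about 4 characters per token" estimate are all assumptions, and a real tokenizer would give different counts. The sketch greedily packs already-split text blocks into chunks that stay under a token budget.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: greedy packing of text blocks under a token budget.
public class TokenSafeChunkerSketch {

    // Assumption: roughly 4 characters per token. This is NOT a real tokenizer.
    public static int estimateTokens(String text) {
        return (text.length() + 3) / 4;
    }

    // Packs blocks in order; starts a new chunk when adding the next block
    // would push the estimated token count above maxTokens.
    public static List<String> chunk(List<String> blocks, int maxTokens) {
        List<String> chunks = new ArrayList<>();
        String current = "";
        for (String block : blocks) {
            String candidate = current.isEmpty() ? block : current + "\n\n" + block;
            if (!current.isEmpty() && estimateTokens(candidate) > maxTokens) {
                chunks.add(current);   // close the chunk that is already full
                current = block;       // the block opens the next chunk
            } else {
                current = candidate;
            }
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> blocks = List.of("aaaaaaaa", "bbbbbbbb", "cccccccc");
        for (String c : chunk(blocks, 5)) {
            System.out.println("[" + estimateTokens(c) + " est. tokens] " + c.replace("\n", "\\n"));
        }
    }
}
```

Note that a single block larger than the budget still becomes its own oversized chunk in this sketch; splitting such a block further (for example by sentence) would be a separate step.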


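Design summary: read the CSV; merge text by page_number; split paragraphs with rules (blank lines, numbering patterns, header patterns); store the result as SemanticBlock objects. The code below reads the CSV and merges text per page, applies the rule-based splitting, and returns a list of semantic blocks. What this code does: OCR JSON ↓ Flatten Read more…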
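The design summarized above (merge by page_number, then split on blank lines, numbering patterns, and header patterns) might look something like the sketch below. It is not the post's code, which is truncated here. The class name RuleBasedSplitterSketch, the two-field SemanticBlock shape, and both regular expressions are assumptions made for illustration, and the rows are assumed to be already parsed from the CSV into {page_number, text} pairs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch of page-level merging followed by rule-based splitting.
public class RuleBasedSplitterSketch {

    // Assumed shape of a semantic block: source page plus block text.
    public static final class SemanticBlock {
        public final int pageNumber;
        public final String text;

        public SemanticBlock(int pageNumber, String text) {
            this.pageNumber = pageNumber;
            this.text = text;
        }
    }

    // Assumed numbering pattern: lines such as "1. ..." or "2) ...".
    private static final Pattern NUMBERED = Pattern.compile("^\\d+[.)]\\s+.*");
    // Assumed header pattern: "#"-style headers or short ALL-CAPS lines.
    private static final Pattern HEADER =
            Pattern.compile("^(#{1,6}\\s+.*|[A-Z][A-Z0-9 ]{2,40})$");

    // Splits one page: a blank line ends a block; a numbered or header line starts one.
    public static List<SemanticBlock> splitPage(int page, String text) {
        List<SemanticBlock> blocks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String rawLine : text.split("\n", -1)) {
            String line = rawLine.trim();
            boolean blank = line.isEmpty();
            boolean startsNewBlock =
                    NUMBERED.matcher(line).matches() || HEADER.matcher(line).matches();
            if ((blank || startsNewBlock) && current.length() > 0) {
                blocks.add(new SemanticBlock(page, current.toString().trim()));
                current.setLength(0);
            }
            if (!blank) {
                current.append(line).append('\n');
            }
        }
        if (current.length() > 0) {
            blocks.add(new SemanticBlock(page, current.toString().trim()));
        }
        return blocks;
    }

    // Merges rows of {page_number, text} per page (first-seen order), then splits each page.
    public static List<SemanticBlock> split(List<String[]> rows) {
        Map<Integer, StringBuilder> pages = new LinkedHashMap<>();
        for (String[] row : rows) {
            int page = Integer.parseInt(row[0].trim());
            pages.computeIfAbsent(page, p -> new StringBuilder()).append(row[1]).append('\n');
        }
        List<SemanticBlock> all = new ArrayList<>();
        pages.forEach((page, sb) -> all.addAll(splitPage(page, sb.toString())));
        return all;
    }
}
```

In this sketch a header line opens a block and the paragraph that follows stays attached to it, so a heading and its first paragraph travel together into the later chunking step.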


AstroScent

Category: career-horoscope

2222

1111


eee

semantic chunking

How to process long documents with an LLM: token splitting strategies and a timeout-prevention structure

xssdd
