mirror of
https://github.com/qaiu/netdisk-fast-download.git
synced 2025-12-16 20:33:03 +00:00
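feat: improve the JavaScript parser

- Optimize JsScriptLoader to auto-discover resource files both inside a JAR and on the file system
- Remove the predefined file list and rely entirely on auto-detection
- Add a getNoRedirect method for redirect handling
- Add a sendMultipartForm method for file uploads
- Add proxy configuration support
- Fix compression handling in JSON parsing
- Add default request headers (Accept-Encoding, User-Agent, Accept-Language)
- Update the docs and correct the description of the export style
- Streamline the README.md structure and remove content that does not fit the module's scope
- Bump the parser version to 10.2.1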
@@ -5,6 +5,7 @@
- Language/build: Java 17 / Maven
- Key interface: cn.qaiu.parser.IPanTool (returns Future<List<FileInfo>>); site implementations live in parser/src/main/java/cn/qaiu/parser/impl
- Data model: cn.qaiu.entity.FileInfo (the unified, externally exposed file item)
- JavaScript parsers: custom parsers can be written in JavaScript and live in parser/src/main/resources/custom-parsers/

---

@@ -75,6 +76,51 @@ List<FileInfo> files = tool.parseFileListSync();
- The asynchronous methods remain available: parse(), parseFileList(), and parseById() return Future objects
- Generate the short-link path with ParserCreate.genPathSuffix() (used for page/server-side aggregation).

## JavaScript Parser Quick Start

Besides Java parsers, custom parsers can also be written in JavaScript:

### 1. Create a JavaScript parser

Create a `.js` file in the `parser/src/main/resources/custom-parsers/` directory:

```javascript
// ==UserScript==
// @name         My Parser
// @type         my_parser
// @displayName  My Netdisk
// @description  A netdisk parser implemented in JavaScript
// @match        https?://example\.com/s/(?<KEY>\w+)
// @author       yourname
// @version      1.0.0
// ==/UserScript==

/**
 * Parse the download link of a single file
 * @param {ShareLinkInfo} shareLinkInfo - share link information
 * @param {JsHttpClient} http - HTTP client
 * @param {JsLogger} logger - logger object
 * @returns {string} download link
 */
function parse(shareLinkInfo, http, logger) {
    var url = shareLinkInfo.getShareUrl();
    var response = http.get(url);
    return response.body();
}
```

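The `@match` line in the script header is a regular expression with a named group, `KEY`. How the loader applies `@match` is not shown in this README, so the snippet below is only a rough illustration of what such a pattern captures. `MATCH_PATTERN` and `extractShareKey` are hypothetical names, and the regex literal needs a JavaScript engine with named-group support (ES2018 or later).

```javascript
// Same pattern as the @match line, written as a JavaScript regex literal.
var MATCH_PATTERN = /https?:\/\/example\.com\/s\/(?<KEY>\w+)/;

// Hypothetical helper (not part of the project): returns the captured KEY,
// or null when the URL does not match the pattern.
function extractShareKey(url) {
    var m = MATCH_PATTERN.exec(url);
    return m && m.groups ? m.groups.KEY : null;
}
```

For instance, `extractShareKey("https://example.com/s/abc123")` returns `"abc123"`, while a URL on another host returns `null`.
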
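### 2. JavaScript parser features

- **Redirect handling**: the `getNoRedirect()` method retrieves the real link behind a 302 redirect
- **Proxy support**: HTTP/SOCKS proxy configuration is supported automatically
- **Type hints**: complete JSDoc type definitions are provided
- **Script reloading**: modified scripts take effect after the application is restarted
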
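The redirect-handling feature can be sketched roughly as follows. Only the method name `getNoRedirect` comes from this README. The response methods used here (`statusCode()` and `header(name)`), the stub client `makeStubHttp`, and the helper `resolveRedirect` are assumptions made for illustration and may not match the real `JsHttpClient`, so check the development guide before relying on them.

```javascript
// Hypothetical stand-in for the injected http client. The response shape
// (statusCode(), header(name)) is an assumption, not the documented API.
function makeStubHttp(location) {
    return {
        getNoRedirect: function (url) {
            return {
                statusCode: function () { return 302; },
                header: function (name) {
                    return name.toLowerCase() === "location" ? location : null;
                }
            };
        }
    };
}

// Requests the URL without following redirects and returns the redirect
// target when the status is 3xx; otherwise returns the URL unchanged.
function resolveRedirect(http, url) {
    var response = http.getNoRedirect(url);
    var status = response.statusCode();
    if (status >= 300 && status < 400) {
        return response.header("Location") || url;
    }
    return url;
}
```

Inside a parser, `resolveRedirect(http, shareLinkInfo.getShareUrl())` would be the equivalent call, under the same assumptions about the response object.
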
### 3. Detailed documentation

- [JavaScript Parser Development Guide](JAVASCRIPT_PARSER_GUIDE.md)
- [Custom Parser Development Guide](CUSTOM_PARSER_GUIDE.md)

---

## 1. Parser conventions

@@ -97,196 +143,125 @@ FileInfo key fields (excerpt):

---

## 2. File list parsing specification (based on the given JSON)
Target JSON (summary):
- List path: data.data[]
- Item structure: item.data (contains attributes, id, type, relationships)
- type: "file" or "folder"

Suggested field mapping:
- Common
  - fileId ← data.id
  - createTime ← data.attributes.created_at (if the format differs, normalize it in the upper layer)
  - updateTime ← data.attributes.updated_at
  - fileType:
    - for files, use data.attributes.mimetype or the fixed value "file"
    - for directories, use the fixed value "folder"
- File (type="file")
  - fileName ← prefer attributes.basename (example: "GBT+28448-2019.pdf"); fall back to attributes.name
  - sizeStr ← attributes.filesize (example: "18MB")
  - size ← try FileSizeConverter.convertToBytes(sizeStr); leave empty on failure
  - parserUrl ← attributes.file_url (example: BilPan://downLoad?id=...)
  - filePath/parentId ← relationships.parent.data.id (may be stored in extParameters.parentId)
  - previewUrl/thumbnail ← attributes.thumbnail (optional)
- Directory (type="folder")
  - fileName ← attributes.name
  - size/sizeStr ← leave empty
  - statistics fields (such as items/trashed_items) may go into extParameters

Edge cases and compatibility:
- attributes.filesize may be empty or a non-standard string; if conversion fails, keep sizeStr and skip size.
- attributes.file_url may use a placeholder scheme (BilPan://); conversion to a direct link is handled in the download stage.
- relationships.* may be empty; null-check before reading.

## 2. File list parsing specification

### General parsing principles

1. **Identify the data structure**: determine the path to the file list from the structure of the netdisk API response
2. **Field mapping**: map netdisk-specific fields to the unified `FileInfo` object
3. **Type distinction**: correctly tell files from folders
4. **Data conversion**: handle format conversions such as time formats and file sizes

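### FileInfo field mapping guide

| FileInfo field | Description | Mapping suggestion |
|----------------|-------------|--------------------|
| `fileName` | File name | Prefer the file-name field; fall back to the title field |
| `fileId` | File ID | Use the unique identifier provided by the netdisk |
| `fileType` | File type | "file" or "folder" |
| `size` | File size (bytes) | Convert to a byte count; may be 0 for folders |
| `sizeStr` | File size (human-readable) | Keep the netdisk's original format, or convert it |
| `createTime` | Creation time | Use a unified time format |
| `updateTime` | Update time | Use a unified time format |
| `parserUrl` | Download link | The download URL provided by the netdisk |
| `previewUrl` | Preview link | Optional; the preview URL provided by the netdisk |
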
### Common data conversions

- **File size**: use `FileSizeConverter` to convert between size strings and byte counts
- **Time format**: convert everything to one standard time format
- **File type**: determine file vs. folder from the netdisk API

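To make the file-size conversion concrete, here is a minimal sketch of turning a size string such as "18MB" into bytes. It is not the project's `FileSizeConverter`, whose implementation is not shown here; `sizeToBytes` and `SIZE_UNITS` are hypothetical names, and the 1024-based units are an assumption.

```javascript
// Assumed binary units (1 KB = 1024 B); the project's converter may differ.
var SIZE_UNITS = {
    B: 1,
    KB: 1024,
    MB: 1024 * 1024,
    GB: 1024 * 1024 * 1024,
    TB: 1024 * 1024 * 1024 * 1024
};

// Hypothetical helper: parses strings like "18MB" or "1.5 KB".
// Returns null instead of throwing when the input is not recognized,
// so one bad entry does not break the whole list.
function sizeToBytes(sizeStr) {
    var m = /^\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)\s*$/i.exec(String(sizeStr));
    if (!m) {
        return null;
    }
    return Math.round(parseFloat(m[1]) * SIZE_UNITS[m[2].toUpperCase()]);
}
```

Returning `null` on failure matches the guidance elsewhere in this document: keep `sizeStr` for display and skip `size` when the value cannot be parsed.
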
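### Parsing notes

- **Data validation**: check that required fields exist to avoid null-pointer exceptions
- **Format compatibility**: handle data-format differences between netdisks
- **Error handling**: provide sensible defaults when a conversion fails
- **Extension fields**: extra information can be stored in `extParameters`
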
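The data-validation and extension-field notes can be illustrated with a small null-safe lookup. This is a sketch under assumptions: `getPath` and `buildExtParameters` are hypothetical helpers, and the nested `relationships.parent.data.id` path is used only as a sample input shape.

```javascript
// Hypothetical helper: walks a dotted path and returns undefined as soon as
// an intermediate value is null or undefined, instead of throwing.
function getPath(obj, path) {
    var parts = path.split(".");
    var current = obj;
    for (var i = 0; i < parts.length; i++) {
        if (current === null || current === undefined) {
            return undefined;
        }
        current = current[parts[i]];
    }
    return current;
}

// Hypothetical helper: collects optional extras into an extParameters-style
// map, adding parentId only when it is actually present.
function buildExtParameters(item) {
    var ext = {};
    var parentId = getPath(item, "relationships.parent.data.id");
    if (parentId !== undefined && parentId !== null) {
        ext.parentId = parentId;
    }
    return ext;
}
```

With this shape, an item that lacks `relationships` simply produces an empty map rather than an exception.
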
### Parsing example

Pseudocode (core of parseFileList):
```java
// --- Previous version (lines removed by this commit) ---
// Illustrative only; swap in the project's JSON utilities
JsonObject root = ...; // API response
JsonArray arr = root.getJsonObject("data").getJsonArray("data");
List<FileInfo> list = new ArrayList<>();
// Index-based loop: a JsonArray iterates as Object, not JsonObject
for (int i = 0; i < arr.size(); i++) {
  JsonObject wrap = arr.getJsonObject(i);
  JsonObject d = wrap.getJsonObject("data");
  String type = d.getString("type");
  JsonObject attrs = d.getJsonObject("attributes");
  FileInfo fi = new FileInfo();
  fi.setFileId(d.getString("id"));
  fi.setCreateTime(attrs.getString("created_at"));
  fi.setUpdateTime(attrs.getString("updated_at"));
  if ("file".equals(type)) {
    String basename = attrs.getString("basename");
    fi.setFileName(basename != null ? basename : attrs.getString("name"));
    fi.setFileType(attrs.getString("mimetype", "file"));
    String sizeStr = attrs.getString("filesize");
    fi.setSizeStr(sizeStr);
    try { if (sizeStr != null) fi.setSize(FileSizeConverter.convertToBytes(sizeStr)); } catch (Exception ignore) {}
    fi.setParserUrl(attrs.getString("file_url"));
    // parentId (optional)
    JsonObject rel = d.getJsonObject("relationships");
    if (rel != null) {
      JsonObject p = rel.getJsonObject("parent");
      if (p != null && p.getJsonObject("data") != null) {
        String pid = p.getJsonObject("data").getString("id");
        Map<String,Object> ext = new HashMap<>();
        ext.put("parentId", pid);
        fi.setExtParameters(ext);
      }
    }
  } else {
    fi.setFileName(attrs.getString("name"));
    fi.setFileType("folder");
  }
  list.add(fi);
}
return Future.succeededFuture(list);

// --- New version (lines added by this commit) ---
// General parsing pattern
JsonObject root = response.json(); // API response
JsonArray fileList = root.getJsonArray("files"); // adjust the path to the actual API
List<FileInfo> result = new ArrayList<>();

for (int i = 0; i < fileList.size(); i++) {
    JsonObject item = fileList.getJsonObject(i);
    FileInfo fileInfo = new FileInfo();

    // Basic field mapping
    fileInfo.setFileName(item.getString("name"));
    fileInfo.setFileId(item.getString("id"));
    // Constant-first equals avoids a NullPointerException when "type" is missing
    fileInfo.setFileType("file".equals(item.getString("type")) ? "file" : "folder");

    // File size handling
    String sizeStr = item.getString("size");
    if (sizeStr != null) {
        fileInfo.setSizeStr(sizeStr);
        try {
            fileInfo.setSize(FileSizeConverter.convertToBytes(sizeStr));
        } catch (Exception e) {
            // On conversion failure keep sizeStr; size stays 0
        }
    }

    // Time handling (formatTime is a helper not defined in this README)
    fileInfo.setCreateTime(formatTime(item.getString("createTime")));
    fileInfo.setUpdateTime(formatTime(item.getString("updateTime")));

    // Download link
    fileInfo.setParserUrl(item.getString("downloadUrl"));

    result.add(fileInfo);
}
```

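The Java example calls `formatTime`, and the JavaScript `parseFileList` example in this README calls `formatSize`; neither helper is defined here. The sketch below shows one plausible shape for both, written in JavaScript. The input and output conventions are assumptions, not the project's actual behavior: `formatTime` takes epoch milliseconds and emits `yyyy-MM-dd HH:mm:ss` in UTC, and `formatSize` uses 1024-based units with two decimals.

```javascript
function pad2(n) {
    return (n < 10 ? "0" : "") + n;
}

// Assumption: input is epoch milliseconds; output is UTC "yyyy-MM-dd HH:mm:ss".
// Returns null for anything that is not a usable timestamp.
function formatTime(epochMillis) {
    if (typeof epochMillis !== "number" || !isFinite(epochMillis)) {
        return null;
    }
    var d = new Date(epochMillis);
    if (isNaN(d.getTime())) {
        return null;
    }
    return d.getUTCFullYear() + "-" + pad2(d.getUTCMonth() + 1) + "-" + pad2(d.getUTCDate()) +
        " " + pad2(d.getUTCHours()) + ":" + pad2(d.getUTCMinutes()) + ":" + pad2(d.getUTCSeconds());
}

// Assumption: 1024-based units; bytes are printed as-is, larger units with
// two decimals. Returns "" for missing or invalid input.
function formatSize(bytes) {
    if (typeof bytes !== "number" || !isFinite(bytes) || bytes < 0) {
        return "";
    }
    var units = ["B", "KB", "MB", "GB", "TB"];
    var value = bytes;
    var i = 0;
    while (value >= 1024 && i < units.length - 1) {
        value /= 1024;
        i++;
    }
    return (i === 0 ? String(value) : value.toFixed(2)) + units[i];
}
```

If the netdisk returns timestamps as strings or in seconds, convert them to a millisecond number before calling `formatTime`.
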

### JavaScript parser example

```javascript
function parseFileList(shareLinkInfo, http, logger) {
    var response = http.get(shareLinkInfo.getShareUrl());
    var data = response.json();

    var fileList = [];
    // Adjust to the actual API; fall back to an empty list if nothing matches
    var files = data.files || data.data || data.items || [];

    for (var i = 0; i < files.length; i++) {
        var file = files[i];
        var fileInfo = {
            fileName: file.name || file.title,
            fileId: file.id,
            fileType: file.type === "file" ? "file" : "folder",
            size: file.size || 0,
            // formatSize is a helper that this README does not define
            sizeStr: file.sizeStr || formatSize(file.size),
            createTime: file.createTime,
            updateTime: file.updateTime,
            parserUrl: file.downloadUrl || file.url
        };

        fileList.push(fileInfo);
    }

    return fileList;
}
```

---

## 3. curl to Java 11 HttpClient examples
GET example (source: developer-oss.lanrar.com):
```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

HttpClient client = HttpClient.newHttpClient();
String q = "<replace with the long query string>";
String url = "https://developer-oss.lanrar.com/file/?" + URLEncoder.encode(q, StandardCharsets.UTF_8);
HttpRequest req = HttpRequest.newBuilder(URI.create(url))
        .header("accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7")
        .header("accept-language", "zh-CN,zh;q=0.9")
        .header("cache-control", "max-age=0")
        .header("dnt", "1")
        .header("priority", "u=0, i")
        .header("referer", "https://developer-oss.lanrar.com/file/?" + q)
        .header("sec-ch-ua", "\"Chromium\";v=\"140\", \"Not=A?Brand\";v=\"24\", \"Microsoft Edge\";v=\"140\"")
        .header("sec-ch-ua-mobile", "?0")
        .header("sec-ch-ua-platform", "\"macOS\"")
        .header("sec-fetch-dest", "document")
        .header("sec-fetch-mode", "navigate")
        .header("sec-fetch-site", "same-origin")
        .header("upgrade-insecure-requests", "1")
        .header("user-agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/140.0.0.0 Safari/537.36 Edg/140.0.0.0")
        .header("Cookie", "acw_tc=<acw_tc>; cdn_sec_tc=<cdn_sec_tc>; acw_sc__v2=<acw_sc__v2>")
        .GET()
        .build();
// send() throws IOException and InterruptedException; handle or declare them in the caller
HttpResponse<String> resp = client.send(req, HttpResponse.BodyHandlers.ofString());
System.out.println(resp.statusCode());
System.out.println(resp.body());
```

POST example (source: Weiyun Share BatchDownload, JSON payload):
```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

HttpClient client = HttpClient.newHttpClient();
String url = "https://share.weiyun.com/webapp/json/weiyunShare/WeiyunShareBatchDownload?refer=chrome_mac&g_tk=1399845656&r=0.3925692266635241";
String json = "{...JSON payload equivalent to the curl/requests call, with placeholder parameters...}";
HttpRequest req = HttpRequest.newBuilder(URI.create(url))
        .header("accept", "application/json, text/plain, */*")
        .header("content-type", "application/json;charset=UTF-8")
        .header("origin", "https://share.weiyun.com")
        .header("referer", "https://share.weiyun.com/<shareKey>")
        .header("user-agent", "Mozilla/5.0 ...")
        .header("Cookie", "uin=<uin>; skey=<skey>; p_skey=<p_skey>; ...")
        .POST(HttpRequest.BodyPublishers.ofString(json, StandardCharsets.UTF_8))
        .build();
HttpResponse<String> resp = client.send(req, HttpResponse.BodyHandlers.ofString());
```
Tips:
- Use placeholders for Cookies/Tokens and inject them from outside, to avoid hard-coding and leaks.
- If parameters such as r/g_tk need to be computed, encapsulate that logic in the implementation class.

---

## 4. IntelliJ IDEA `.http` debugging samples
Save as `requests.http`; it can be used together with environment variables.

GET:
```http
### Developer resource GET example
GET https://developer-oss.lanrar.com/file/?{{q}}
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
accept-language: zh-CN,zh;q=0.9
cache-control: max-age=0
dnt: 1
priority: u=0, i
referer: https://developer-oss.lanrar.com/file/?{{q}}
sec-ch-ua: "Chromium";v="140", "Not=A?Brand";v="24", "Microsoft Edge";v="140"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "macOS"
sec-fetch-dest: document
sec-fetch-mode: navigate
sec-fetch-site: same-origin
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/140.0.0.0 Safari/537.36 Edg/140.0.0.0
Cookie: acw_tc={{acw_tc}}; cdn_sec_tc={{cdn_sec_tc}}; acw_sc__v2={{acw_sc_v2}}

> {% client.log("status: " + response.status); %}

### Environment variables (can be configured in the HTTP Client Environment)
@q=<replace with the actual long query string>
@acw_tc=your_acw_tc
@cdn_sec_tc=your_cdn_sec_tc
@acw_sc_v2=your_acw_sc__v2
```

POST:
```http
### Weiyun batch download POST example
POST https://share.weiyun.com/webapp/json/weiyunShare/WeiyunShareBatchDownload?refer=chrome_mac&g_tk={{g_tk}}&r={{r}}
accept: application/json, text/plain, */*
content-type: application/json;charset=UTF-8
origin: https://share.weiyun.com
referer: https://share.weiyun.com/{{share_key}}
user-agent: Mozilla/5.0 ...
Cookie: uin={{uin}}; skey={{skey}}; p_skey={{p_skey}}; p_uin={{p_uin}}; wyctoken={{wyctoken}}

{
  "req_header": "{...}",
  "req_body": "{...}"
}
```

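
---

## 5. Development workflow suggestions
## 3. Development workflow suggestions
- New sites: add a Tool under impl that implements IPanTool, reusing PanBase and the template classes; add unit tests.
- Incomplete fields: backfill sizeStr, createTime, and similar fields where possible to help front-end display; leave unavailable fields empty.
- Unit tests: place them in parser/src/test/java; aim for 1-2 happy-path cases plus 1 edge case.
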
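## 6. FAQ
## 4. FAQ
- Size parsing fails: keep sizeStr and skip size; avoid throwing an exception that would break the whole list.
- Download links with a placeholder scheme: always store them in parserUrl; conversion to a direct link is handled in the download stage.
- Authentication: expired Cookies/Tokens are handled by an upper-layer refresh or by external injection; parsers are best kept stateless.

---
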
## 7. References
## 5. References
- FileInfo: parser/src/main/java/cn/qaiu/entity/FileInfo.java
- IPanTool: parser/src/main/java/cn/qaiu/parser/IPanTool.java
- FileSizeConverter: parser/src/main/java/cn/qaiu/util/FileSizeConverter.java