Commit Graph

201 Commits

Author SHA1 Message Date
rhysd b68624f2f3 support HTTP and HTTPS proxies (fix #103) 2020-01-20 17:02:43 +09:00
Sunshine a9d114d04d
Merge pull request #105 from rhysd/refactor-main
Refactoring for main.rs to address several issues
2020-01-20 01:10:29 -05:00
rhysd 4e4ebe9c98 refactor main to address several issues
Addressed issues:

- when the specified URL was invalid, it exited successfully without
  doing anything, so users had no way to know why it did not work
- it exited successfully even if an invalid User-Agent value was specified
- it created the file twice when the `--output` option was specified, which
  may cause issues when a file watcher (e.g. FSEvents on macOS) is watching

Improvements:
- handle errors with `Result::expect` consistently, so that it correctly
  exits with a non-zero status on error
- define `Output` enum for handling both stdout and file outputs
2020-01-15 16:52:20 +09:00
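
The `Output` enum mentioned in this commit could look roughly like the sketch below. This is a minimal illustration, not the code from the commit: the variant names, `Output::new`, and `writeln_str` are assumptions. The point is that the destination is opened exactly once, and that failures reach `main`, where `expect` turns them into a non-zero exit status.

```rust
use std::fs::File;
use std::io::{self, Write};

// Hypothetical sketch: variant and method names are assumptions,
// not taken from the actual commit.
enum Output {
    Stdout(io::Stdout),
    File(File),
}

impl Output {
    // Open the destination once: a file when a path is given, stdout otherwise.
    fn new(path: Option<&str>) -> io::Result<Output> {
        match path {
            Some(p) => Ok(Output::File(File::create(p)?)),
            None => Ok(Output::Stdout(io::stdout())),
        }
    }

    // Write one line to whichever destination was selected.
    fn writeln_str(&mut self, s: &str) -> io::Result<()> {
        match self {
            Output::Stdout(out) => writeln!(out, "{}", s),
            Output::File(file) => writeln!(file, "{}", s),
        }
    }
}

fn main() {
    // `expect` panics on error, so the process exits with a non-zero status.
    let mut out = Output::new(None).expect("could not prepare output");
    out.writeln_str("<html></html>").expect("could not write output");
}
```

Passing `Some("page.html")` instead of `None` would send the same line to a file that is created a single time.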
rhysd 1779f4a374 better comments for JS_DOM_EVENT_ATTRS constant 2020-01-15 14:33:27 +09:00
rhysd 26e89ae6d3 use complete list of DOM event handlers 2020-01-15 13:58:09 +09:00
rhysd 69d99b69e8 remove . in line comment 2020-01-13 23:47:07 +09:00
Emi Simpson 05985583f0
Switch timestamps from rfc822 local time to iso8601 UTC 2020-01-10 14:30:35 -05:00
Emi Simpson 651fa716b4
Clean user, pass, and fragment from URL before writing 2020-01-10 14:18:15 -05:00
rhysd 67b79e92f9 simplify &x.into_iter() to x.iter() 2020-01-10 14:45:02 +09:00
rhysd b51f41fe34 trim attribute values 2020-01-10 14:41:05 +09:00
rhysd 6f158dc6db compare values of 'rel' properties case-insensitively 2020-01-10 13:52:31 +09:00
rhysd 8d7052b39c ignore preload and prefetch sources
since all resources are embedded as data URLs.
2020-01-09 18:18:21 +09:00
rhysd 660511b8a0 define link type of <link> element as enum and prefer match statement
since match statement checks exhaustiveness
2020-01-09 16:55:42 +09:00
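
A rough sketch of the enum-plus-`match` approach described in the commits above, assuming hypothetical variant and function names (`LinkType`, `parse_link_type`, `should_embed`); the actual variants are not listed in the log. Because the `match` has no wildcard arm, adding a new variant later makes the compiler reject the code until the new case is handled.

```rust
// Hypothetical sketch: the real enum's variants are not shown in the log.
#[derive(Debug, PartialEq)]
enum LinkType {
    Icon,
    Stylesheet,
    Preload,
    Unknown,
}

// Classify a `rel` value, trimming it and comparing case-insensitively.
fn parse_link_type(rel: &str) -> LinkType {
    let rel = rel.trim();
    if rel.eq_ignore_ascii_case("icon") || rel.eq_ignore_ascii_case("shortcut icon") {
        LinkType::Icon
    } else if rel.eq_ignore_ascii_case("stylesheet") {
        LinkType::Stylesheet
    } else if rel.eq_ignore_ascii_case("preload") || rel.eq_ignore_ascii_case("prefetch") {
        LinkType::Preload
    } else {
        LinkType::Unknown
    }
}

// No `_` arm: the compiler checks that every variant is covered.
fn should_embed(link_type: &LinkType) -> bool {
    match link_type {
        LinkType::Icon | LinkType::Stylesheet => true,
        // Preload/prefetch sources are skipped, since resources are embedded anyway.
        LinkType::Preload | LinkType::Unknown => false,
    }
}

fn main() {
    for rel in [" StyleSheet ", "preload", "author"] {
        let link_type = parse_link_type(rel);
        println!("{:?}: embed = {}", link_type, should_embed(&link_type));
    }
}
```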
Emi Simpson 9be3982dc6
Added --no-context flag to disable adding context comment 2020-01-08 19:00:53 -05:00
Emi Simpson 27c9fb4cd3
Added comment indicating the context under which the page was downloaded 2020-01-08 18:51:18 -05:00
rhysd 6e99ad13e7 upgrade reqwest to v0.10.0
This will improve build time and binary size as follows:

* Before

- **Compile targets**: 220
- **Build time**: `cargo build --release  1264.95s user 39.72s system 335% cpu 6:29.14 total`
- **Binary size**: 6578568 bytes

* After

- **Compile targets**: 170
- **Build time**: `cargo build --release  1130.64s user 32.15s system 359% cpu 5:23.69 total`
- **Binary size**: 6107088 bytes

* Differences

- **Compile targets**: 1.29x smaller
- **Build time**: 1.20x faster
- **Binary size**: 1.08x smaller
2020-01-07 14:22:32 +09:00
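
The ratios in this commit message can be recomputed from the raw measurements. The sketch below assumes that the build-time comparison uses the wall-clock `total` values (6:29.14 and 5:23.69) rather than user or system CPU time; the helper names `ratio` and `seconds` are illustrative only.

```rust
// Ratio of a "before" measurement to an "after" measurement.
fn ratio(before: f64, after: f64) -> f64 {
    before / after
}

// Convert a `m:ss.ss` wall-clock reading into seconds.
fn seconds(minutes: f64, secs: f64) -> f64 {
    minutes * 60.0 + secs
}

fn main() {
    let targets = ratio(220.0, 170.0);
    // Assumption: the wall-clock totals are the figures being compared.
    let build_time = ratio(seconds(6.0, 29.14), seconds(5.0, 23.69));
    let binary_size = ratio(6_578_568.0, 6_107_088.0);

    println!("compile targets: {:.2}x", targets);
    println!("build time:      {:.2}x", build_time);
    println!("binary size:     {:.2}x", binary_size);
}
```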
Sunshine 413dd66886
Merge pull request #96 from rhysd/refactorings
Refactorings
2020-01-05 18:46:31 -05:00
rhysd dc7ec6e7a8 remove more redundant type annotations 2020-01-04 16:33:11 +09:00
rhysd ed879231af fix test code that was broken by refactoring 2020-01-04 08:07:19 +09:00
rhysd ddf4b8ac13 prefer &str to String for reducing allocations 2020-01-04 08:05:02 +09:00
rhysd 84c13f0605 prefer unwrap_or_default to unwrap_or 2020-01-04 07:58:29 +09:00
rhysd ce03e0e487 reduce allocation on checking DOM attributes and do not hard-code number of elements of array constant
`to_lower` allocates a new string, but the allocation is not necessary
here.
2020-01-04 07:52:47 +09:00
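
One way to get both effects named in this commit, sketched under assumptions: the list below is a shortened stand-in for the real `JS_DOM_EVENT_ATTRS` constant, and `is_event_attr` is a hypothetical helper name. Declaring the constant as a slice avoids spelling out its length, and `str::eq_ignore_ascii_case` compares in place instead of building a lowercased copy.

```rust
// Shortened, illustrative list; the real constant is much longer.
// A slice type (`&[&str]`) means the element count is never hard-coded.
const JS_DOM_EVENT_ATTRS: &[&str] = &["onclick", "onerror", "onload"];

// Hypothetical helper: case-insensitive lookup without allocating a new string.
fn is_event_attr(name: &str) -> bool {
    JS_DOM_EVENT_ATTRS
        .iter()
        .any(|attr| attr.eq_ignore_ascii_case(name))
}

fn main() {
    for name in ["OnClick", "href"] {
        println!("{}: {}", name, is_event_attr(name));
    }
}
```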
rhysd 63e19998d0 reduce clones and fix some code styles and redundant code 2020-01-04 07:49:26 +09:00
rhysd 75fb6961ed migrate to Rust 2018 2020-01-03 00:33:49 +09:00
Sunshine 5ba8931502
Merge pull request #92 from snshn/output-file-option
Add option for saving output to file
2019-12-26 18:13:15 -05:00
Sunshine 88ffde0c3b wipe integrity attributes 2019-12-26 09:44:01 -05:00
Sunshine bfb97bd062 add option for saving output to file 2019-12-26 00:45:20 -05:00
Sunshine 295931041c
Merge pull request #80 from Alch-Emi/lazyload
Add support for lazy loaded images
2019-12-24 17:11:21 -05:00
Emi Simpson dab4ae6965
Merged Y2Z/master with Alch-Emi/lazyload 2019-12-24 10:07:56 -05:00
Sunshine c7fc121c7c use clean URLs as hashmap keys 2019-12-18 11:49:38 -05:00
Sunshine 9ff9dd0928
Merge pull request #82 from snshn/str
implement str!() macro
2019-12-13 03:51:44 -05:00
Emi Simpson 3d4a932ac1
Merge Y2Z/master, fix conflicts between shared-client & resolve-css 2019-12-12 19:29:21 -05:00
Sunshine 9fe913d853 implement str!() macro 2019-12-11 01:36:14 -05:00
Sunshine 862489e41b Get rid of brackets around URLs 2019-12-11 01:17:00 -05:00
Emi Simpson 322ab41b8c
Updated tests to reflect API changes 2019-12-10 00:00:15 -05:00
Emi Simpson 65d0eab793
Use a shared client initialized in main.rs 2019-12-09 22:17:54 -05:00
Emi Simpson 292221ea28
Lazyloaded images are now loaded at compilation, with placeholders omitted 2019-12-09 19:40:29 -05:00
Emi Simpson 614af44c92
Grammatical and stylistic fixes 2019-12-09 13:58:12 -05:00
Emi Simpson feb37f5812
Added support for lazy loaded images
Note: This patch works by resolving any data-src attributes on images in the
same way as normal src attributes are resolved. It is assumed that most
lazy-load libraries use this attribute, and that if it is set, it holds a
URL that is in use.
2019-12-06 19:27:41 -05:00
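
The note above might translate into something like the following sketch. It is not the patch itself: the attribute list is modelled as plain `(name, value)` pairs rather than a parsed DOM node, the helper names `attr` and `image_url` are hypothetical, and giving `data-src` precedence over `src` is an assumption made for illustration.

```rust
// Look up a non-empty attribute value by name (case-insensitive).
// Attributes are modelled as (name, value) pairs; this is an assumption,
// the real code works on parsed DOM nodes.
fn attr<'a>(attrs: &[(&str, &'a str)], name: &str) -> Option<&'a str> {
    attrs
        .iter()
        .find(|&&(key, _)| key.eq_ignore_ascii_case(name))
        .map(|&(_, value)| value.trim())
        .filter(|value| !value.is_empty())
}

// Assumption: a non-empty `data-src` wins over `src`, since `src` often
// holds only a placeholder when a lazy-load library is in use.
fn image_url<'a>(attrs: &[(&str, &'a str)]) -> Option<&'a str> {
    attr(attrs, "data-src").or_else(|| attr(attrs, "src"))
}

fn main() {
    let lazy = [("src", "placeholder.gif"), ("data-src", "photo.jpg")];
    let plain = [("src", "logo.png")];
    println!("{:?}", image_url(&lazy));
    println!("{:?}", image_url(&plain));
}
```

The returned URL would then go through the same resolution and embedding path as any ordinary image source.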
Emi Simpson 028beb821c
Rustfmt update for nightly formatter 2019-12-06 16:46:52 -05:00
Emi Simpson 76ccff80f9
Fixed failure of regex to match @imports 2019-12-06 16:15:34 -05:00
Emi Simpson 45335d7507
Support links in style= attributes 2019-12-06 15:28:08 -05:00
Emi Simpson a4743ca92f
Respect the --no-images flag while parsing CSS 2019-12-06 15:00:06 -05:00
Emi Simpson b96a777e8a
Merge commit '4decea7' into load-css-imports 2019-12-06 13:56:36 -05:00
Emi Simpson 4decea716c
Fixed css replacement with more than one linked asset 2019-12-06 13:55:43 -05:00
Emi Simpson 695a787206
Moved regex compilation to lazy_static 2019-12-06 13:53:44 -05:00
Emi Simpson 90e6cb1c45
Prevent crash on URLs delimited by single quotes 2019-12-06 11:52:31 -05:00
Emi Simpson 7412d663e0
Use a slightly more efficient .replace_range() instead of cloning the string twice 2019-12-06 11:37:05 -05:00
Emi Simpson 8646af6e9f
removed debug code (woops sorry) 2019-12-06 10:52:20 -05:00
Emi Simpson de383c94b1
Applied rustfmt 2019-12-05 20:41:43 -05:00
Emi Simpson ab65b44f0d
Cleaned up some overcomplicated code 2019-12-05 20:22:39 -05:00
Emi Simpson 13bacb4320
EMPTY_STRING no longer used 2019-12-05 20:11:19 -05:00
Emi Simpson d574e9a5da
Added support for <style> tags 2019-12-05 20:05:52 -05:00
Emi Simpson 1de0fc0961
Add warning and fallback when parsing a rel=stylesheet link 2019-12-05 19:10:47 -05:00
Emi Simpson ebbf755e09
Fixed misleading variable name 2019-12-05 19:02:11 -05:00
Emi Simpson d3956a7905
Made merge compatible with Y2Z/master 2019-12-05 19:01:03 -05:00
Emi Simpson ef7ddcd434
Added fallback to absolute URL on failure to resolve CSS stylesheet @imports 2019-12-05 18:37:37 -05:00
Emi Simpson 11bbfc0851
Added support for recursively nested css @imports 2019-12-05 18:15:06 -05:00
Emi Simpson a2bf7e3345
Fixed some errors detecting, parsing, and transforming urls in `resolve_css_imports` 2019-12-05 17:42:07 -05:00
voila 1ff5e91087 Use HashMap as cache to minimize the number of HTTP requests (#75)
Use HashMap as cache to minimize the number of HTTP requests
2019-10-22 18:33:22 -04:00
knidarkness 550e4cc83f Fixed formatting 2019-10-12 14:05:07 +03:00
knidarkness 5443c0cc3f Added loading of the links given as url(...) in css files 2019-10-12 12:32:59 +03:00
robatipoor 55fe523a1c refactor utils functions 2019-10-10 16:53:00 +03:30
robatipoor 2e48ea90e1 move argument parser section to args mod 2019-10-10 08:58:12 +03:30
Sunshine 0896f2e214 Properly handle 30x redirects 2019-09-30 23:58:09 -04:00
Vincent Flyson 3948ea3aa0 Improve code structure 2019-09-29 17:15:49 -04:00
Vincent Flyson eec05767cf Add support for poster attribute 2019-09-22 12:57:50 -04:00
Vincent Flyson 88a230872c Add CSP isolation, no CSS, and no iframe options 2019-09-21 22:59:03 -04:00
Vincent Flyson 04cbbefafa Ignore empty src in images, accept fluid icons 2019-09-08 02:51:53 -04:00
Vincent Flyson 824b418f80 Add more tests 2019-08-30 22:18:14 -04:00
Vincent Flyson 2251c086c2
Merge branch 'master' into ignore-ca 2019-08-27 06:55:00 -04:00
Vincent Flyson 02b717ae54 Add flag to ignore errors related to TLS certificates 2019-08-26 23:17:36 -04:00
Vincent Flyson 1329dbe6f8 Ignore iframes with empty src 2019-08-26 22:57:10 -04:00
Vincent Flyson 2b96f9a32a Add flag for silent mode 2019-08-25 11:41:30 -04:00
Vincent Flyson 50f1ba1ce8 Treat network errors as empty results 2019-08-24 23:06:40 -04:00
Vincent Flyson 0b2e1d0746 Remove compiler warning 2019-08-24 20:23:53 -04:00
Vincent Flyson 2c0037fd51
Merge pull request #37 from Y2Z/ignore-other-protocols
Avoid modifying non-HTTP anchor hrefs
2019-08-24 20:07:25 -04:00
Vincent Flyson 3f79068d7d Avoid modifying non-HTTP anchor hrefs 2019-08-24 14:54:23 -04:00
Vincent Flyson 080ed50264
Merge pull request #36 from Y2Z/no-images-no-icons
Ignore icons if told to avoid images
2019-08-24 14:49:26 -04:00
Vincent Flyson b596b5bb7d Ignore icons if told to avoid images 2019-08-24 14:22:34 -04:00
Vincent Flyson c480fcb70f Add basic verbose output 2019-08-24 13:33:24 -04:00
Vincent Flyson e2ab05a323 Fix name of a test function 2019-08-24 12:06:03 -04:00
Vincent Flyson d49e825777 Add support for picture elements 2019-08-24 11:21:29 -04:00
Vincent Flyson d8d6437a15
Merge branch 'master' into author-robatipoor 2019-08-23 23:24:07 -04:00
Vincent Flyson c34d77d5d8 Revamp resolve_url() and improve code format 2019-08-23 23:06:14 -04:00
Vincent Flyson 2be5b1c235 Add support for iframes 2019-08-23 20:16:16 -04:00
Vincent Flyson c0fffbb212 Get rid of mime-sniffer dependency 2019-08-23 18:48:08 -04:00
Vincent Flyson 13429e32d3 Allow HTTP redirects and preserve email links 2019-08-23 16:00:05 -04:00
Vincent Flyson d3497b5db1
Merge branch 'master' into author-robatipoor 2019-08-23 15:09:46 -04:00
Vincent Flyson 7e298b0b02 Add -u flag for custom User-Agent 2019-08-23 15:00:56 -04:00
Vincent Flyson 7b326ee9f5 Add robatipoor to list of authors 2019-08-23 14:44:16 -04:00
mahdi 14e9e6facc cargo clippy 2019-08-23 22:54:45 +04:30
Vincent Flyson 54fdd890f9 Parse Cargo.toml for strings 2019-08-23 05:49:14 -04:00
Vincent Flyson 54ae61b728
Merge pull request #15 from Y2Z/increase-timeout
Increase request timeout to 10 seconds
2019-08-23 05:23:12 -04:00
Vincent Flyson 1480b20cb6
Merge pull request #14 from robatipoor/master
some code refactor
2019-08-23 05:21:27 -04:00
Vincent Flyson fde715a27e Increase request timeout to 10 seconds 2019-08-23 05:08:38 -04:00
mahdi b25b6765c1 some code refactor 2019-08-23 13:19:29 +04:30
Vincent Flyson a891782ce4 Add option to exclude images 2019-08-23 04:33:30 -04:00
Vincent Flyson ffa60a0ce2 Convert action attribute to full URL for FORM tags 2019-08-23 03:28:55 -04:00
Vincent Flyson cc09c2ba01 Remove leftover line 2019-08-22 23:21:28 -04:00
Vincent Flyson 143c396c71 Rewrite program in Rust 2019-08-22 23:17:15 -04:00