commit 9c7f9cfea032800e62e52ecf87b651784c1d838c Author: Marek Goc Date: Tue Oct 17 17:51:53 2023 +0200 first commit diff --git a/.gitattributes b/.gitattributes new file mode 100644 index 0000000..8a6e05c --- /dev/null +++ b/.gitattributes @@ -0,0 +1,6 @@ +# Enforce LF for Netpbm formats on Windows +testdata/test.pam text eol=lf +testdata/test.pbm text eol=lf +testdata/test.pfm text eol=lf +testdata/test.pgm text eol=lf +testdata/test.ppm text eol=lf diff --git a/AUTHORS b/AUTHORS new file mode 100644 index 0000000..28cd2c5 --- /dev/null +++ b/AUTHORS @@ -0,0 +1 @@ +Milan Nikolic diff --git a/COPYING b/COPYING new file mode 100644 index 0000000..dba13ed --- /dev/null +++ b/COPYING @@ -0,0 +1,661 @@ + GNU AFFERO GENERAL PUBLIC LICENSE + Version 3, 19 November 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The GNU Affero General Public License is a free, copyleft license for +software and other kinds of works, specifically designed to ensure +cooperation with the community in the case of network server software. + + The licenses for most software and other practical works are designed +to take away your freedom to share and change the works. By contrast, +our General Public Licenses are intended to guarantee your freedom to +share and change all versions of a program--to make sure it remains free +software for all its users. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +them if you wish), that you receive source code or can get it if you +want it, that you can change the software or use pieces of it in new +free programs, and that you know you can do these things. 
+ + Developers that use our General Public Licenses protect your rights +with two steps: (1) assert copyright on the software, and (2) offer +you this License which gives you legal permission to copy, distribute +and/or modify the software. + + A secondary benefit of defending all users' freedom is that +improvements made in alternate versions of the program, if they +receive widespread use, become available for other developers to +incorporate. Many developers of free software are heartened and +encouraged by the resulting cooperation. However, in the case of +software used on network servers, this result may fail to come about. +The GNU General Public License permits making a modified version and +letting the public access it on a server without ever releasing its +source code to the public. + + The GNU Affero General Public License is designed specifically to +ensure that, in such cases, the modified source code becomes available +to the community. It requires the operator of a network server to +provide the source code of the modified version running there to the +users of that server. Therefore, public use of a modified version, on +a publicly accessible server, gives the public access to the source +code of the modified version. + + An older license, called the Affero General Public License and +published by Affero, was designed to accomplish similar goals. This is +a different license, not a version of the Affero GPL, but Affero has +released a new version of the Affero GPL which permits relicensing under +this license. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS + + 0. Definitions. + + "This License" refers to version 3 of the GNU Affero General Public License. + + "Copyright" also means copyright-like laws that apply to other kinds of +works, such as semiconductor masks. + + "The Program" refers to any copyrightable work licensed under this +License. Each licensee is addressed as "you". 
"Licensees" and +"recipients" may be individuals or organizations. + + To "modify" a work means to copy from or adapt all or part of the work +in a fashion requiring copyright permission, other than the making of an +exact copy. The resulting work is called a "modified version" of the +earlier work or a work "based on" the earlier work. + + A "covered work" means either the unmodified Program or a work based +on the Program. + + To "propagate" a work means to do anything with it that, without +permission, would make you directly or secondarily liable for +infringement under applicable copyright law, except executing it on a +computer or modifying a private copy. Propagation includes copying, +distribution (with or without modification), making available to the +public, and in some countries other activities as well. + + To "convey" a work means any kind of propagation that enables other +parties to make or receive copies. Mere interaction with a user through +a computer network, with no transfer of a copy, is not conveying. + + An interactive user interface displays "Appropriate Legal Notices" +to the extent that it includes a convenient and prominently visible +feature that (1) displays an appropriate copyright notice, and (2) +tells the user that there is no warranty for the work (except to the +extent that warranties are provided), that licensees may convey the +work under this License, and how to view a copy of this License. If +the interface presents a list of user commands or options, such as a +menu, a prominent item in the list meets this criterion. + + 1. Source Code. + + The "source code" for a work means the preferred form of the work +for making modifications to it. "Object code" means any non-source +form of a work. 
+ + A "Standard Interface" means an interface that either is an official +standard defined by a recognized standards body, or, in the case of +interfaces specified for a particular programming language, one that +is widely used among developers working in that language. + + The "System Libraries" of an executable work include anything, other +than the work as a whole, that (a) is included in the normal form of +packaging a Major Component, but which is not part of that Major +Component, and (b) serves only to enable use of the work with that +Major Component, or to implement a Standard Interface for which an +implementation is available to the public in source code form. A +"Major Component", in this context, means a major essential component +(kernel, window system, and so on) of the specific operating system +(if any) on which the executable work runs, or a compiler used to +produce the work, or an object code interpreter used to run it. + + The "Corresponding Source" for a work in object code form means all +the source code needed to generate, install, and (for an executable +work) run the object code and to modify the work, including scripts to +control those activities. However, it does not include the work's +System Libraries, or general-purpose tools or generally available free +programs which are used unmodified in performing those activities but +which are not part of the work. For example, Corresponding Source +includes interface definition files associated with source files for +the work, and the source code for shared libraries and dynamically +linked subprograms that the work is specifically designed to require, +such as by intimate data communication or control flow between those +subprograms and other parts of the work. + + The Corresponding Source need not include anything that users +can regenerate automatically from other parts of the Corresponding +Source. + + The Corresponding Source for a work in source code form is that +same work. + + 2. 
Basic Permissions. + + All rights granted under this License are granted for the term of +copyright on the Program, and are irrevocable provided the stated +conditions are met. This License explicitly affirms your unlimited +permission to run the unmodified Program. The output from running a +covered work is covered by this License only if the output, given its +content, constitutes a covered work. This License acknowledges your +rights of fair use or other equivalent, as provided by copyright law. + + You may make, run and propagate covered works that you do not +convey, without conditions so long as your license otherwise remains +in force. You may convey covered works to others for the sole purpose +of having them make modifications exclusively for you, or provide you +with facilities for running those works, provided that you comply with +the terms of this License in conveying all material for which you do +not control copyright. Those thus making or running the covered works +for you must do so exclusively on your behalf, under your direction +and control, on terms that prohibit them from making any copies of +your copyrighted material outside their relationship with you. + + Conveying under any other circumstances is permitted solely under +the conditions stated below. Sublicensing is not allowed; section 10 +makes it unnecessary. + + 3. Protecting Users' Legal Rights From Anti-Circumvention Law. + + No covered work shall be deemed part of an effective technological +measure under any applicable law fulfilling obligations under article +11 of the WIPO copyright treaty adopted on 20 December 1996, or +similar laws prohibiting or restricting circumvention of such +measures. 
+ + When you convey a covered work, you waive any legal power to forbid +circumvention of technological measures to the extent such circumvention +is effected by exercising rights under this License with respect to +the covered work, and you disclaim any intention to limit operation or +modification of the work as a means of enforcing, against the work's +users, your or third parties' legal rights to forbid circumvention of +technological measures. + + 4. Conveying Verbatim Copies. + + You may convey verbatim copies of the Program's source code as you +receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice; +keep intact all notices stating that this License and any +non-permissive terms added in accord with section 7 apply to the code; +keep intact all notices of the absence of any warranty; and give all +recipients a copy of this License along with the Program. + + You may charge any price or no price for each copy that you convey, +and you may offer support or warranty protection for a fee. + + 5. Conveying Modified Source Versions. + + You may convey a work based on the Program, or the modifications to +produce it from the Program, in the form of source code under the +terms of section 4, provided that you also meet all of these conditions: + + a) The work must carry prominent notices stating that you modified + it, and giving a relevant date. + + b) The work must carry prominent notices stating that it is + released under this License and any conditions added under section + 7. This requirement modifies the requirement in section 4 to + "keep intact all notices". + + c) You must license the entire work, as a whole, under this + License to anyone who comes into possession of a copy. This + License will therefore apply, along with any applicable section 7 + additional terms, to the whole of the work, and all its parts, + regardless of how they are packaged. 
This License gives no + permission to license the work in any other way, but it does not + invalidate such permission if you have separately received it. + + d) If the work has interactive user interfaces, each must display + Appropriate Legal Notices; however, if the Program has interactive + interfaces that do not display Appropriate Legal Notices, your + work need not make them do so. + + A compilation of a covered work with other separate and independent +works, which are not by their nature extensions of the covered work, +and which are not combined with it such as to form a larger program, +in or on a volume of a storage or distribution medium, is called an +"aggregate" if the compilation and its resulting copyright are not +used to limit the access or legal rights of the compilation's users +beyond what the individual works permit. Inclusion of a covered work +in an aggregate does not cause this License to apply to the other +parts of the aggregate. + + 6. Conveying Non-Source Forms. + + You may convey a covered work in object code form under the terms +of sections 4 and 5, provided that you also convey the +machine-readable Corresponding Source under the terms of this License, +in one of these ways: + + a) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by the + Corresponding Source fixed on a durable physical medium + customarily used for software interchange. 
+ + b) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by a + written offer, valid for at least three years and valid for as + long as you offer spare parts or customer support for that product + model, to give anyone who possesses the object code either (1) a + copy of the Corresponding Source for all the software in the + product that is covered by this License, on a durable physical + medium customarily used for software interchange, for a price no + more than your reasonable cost of physically performing this + conveying of source, or (2) access to copy the + Corresponding Source from a network server at no charge. + + c) Convey individual copies of the object code with a copy of the + written offer to provide the Corresponding Source. This + alternative is allowed only occasionally and noncommercially, and + only if you received the object code with such an offer, in accord + with subsection 6b. + + d) Convey the object code by offering access from a designated + place (gratis or for a charge), and offer equivalent access to the + Corresponding Source in the same way through the same place at no + further charge. You need not require recipients to copy the + Corresponding Source along with the object code. If the place to + copy the object code is a network server, the Corresponding Source + may be on a different server (operated by you or a third party) + that supports equivalent copying facilities, provided you maintain + clear directions next to the object code saying where to find the + Corresponding Source. Regardless of what server hosts the + Corresponding Source, you remain obligated to ensure that it is + available for as long as needed to satisfy these requirements. 
+ + e) Convey the object code using peer-to-peer transmission, provided + you inform other peers where the object code and Corresponding + Source of the work are being offered to the general public at no + charge under subsection 6d. + + A separable portion of the object code, whose source code is excluded +from the Corresponding Source as a System Library, need not be +included in conveying the object code work. + + A "User Product" is either (1) a "consumer product", which means any +tangible personal property which is normally used for personal, family, +or household purposes, or (2) anything designed or sold for incorporation +into a dwelling. In determining whether a product is a consumer product, +doubtful cases shall be resolved in favor of coverage. For a particular +product received by a particular user, "normally used" refers to a +typical or common use of that class of product, regardless of the status +of the particular user or of the way in which the particular user +actually uses, or expects or is expected to use, the product. A product +is a consumer product regardless of whether the product has substantial +commercial, industrial or non-consumer uses, unless such uses represent +the only significant mode of use of the product. + + "Installation Information" for a User Product means any methods, +procedures, authorization keys, or other information required to install +and execute modified versions of a covered work in that User Product from +a modified version of its Corresponding Source. The information must +suffice to ensure that the continued functioning of the modified object +code is in no case prevented or interfered with solely because +modification has been made. 
+ + If you convey an object code work under this section in, or with, or +specifically for use in, a User Product, and the conveying occurs as +part of a transaction in which the right of possession and use of the +User Product is transferred to the recipient in perpetuity or for a +fixed term (regardless of how the transaction is characterized), the +Corresponding Source conveyed under this section must be accompanied +by the Installation Information. But this requirement does not apply +if neither you nor any third party retains the ability to install +modified object code on the User Product (for example, the work has +been installed in ROM). + + The requirement to provide Installation Information does not include a +requirement to continue to provide support service, warranty, or updates +for a work that has been modified or installed by the recipient, or for +the User Product in which it has been modified or installed. Access to a +network may be denied when the modification itself materially and +adversely affects the operation of the network or violates the rules and +protocols for communication across the network. + + Corresponding Source conveyed, and Installation Information provided, +in accord with this section must be in a format that is publicly +documented (and with an implementation available to the public in +source code form), and must require no special password or key for +unpacking, reading or copying. + + 7. Additional Terms. + + "Additional permissions" are terms that supplement the terms of this +License by making exceptions from one or more of its conditions. +Additional permissions that are applicable to the entire Program shall +be treated as though they were included in this License, to the extent +that they are valid under applicable law. 
If additional permissions +apply only to part of the Program, that part may be used separately +under those permissions, but the entire Program remains governed by +this License without regard to the additional permissions. + + When you convey a copy of a covered work, you may at your option +remove any additional permissions from that copy, or from any part of +it. (Additional permissions may be written to require their own +removal in certain cases when you modify the work.) You may place +additional permissions on material, added by you to a covered work, +for which you have or can give appropriate copyright permission. + + Notwithstanding any other provision of this License, for material you +add to a covered work, you may (if authorized by the copyright holders of +that material) supplement the terms of this License with terms: + + a) Disclaiming warranty or limiting liability differently from the + terms of sections 15 and 16 of this License; or + + b) Requiring preservation of specified reasonable legal notices or + author attributions in that material or in the Appropriate Legal + Notices displayed by works containing it; or + + c) Prohibiting misrepresentation of the origin of that material, or + requiring that modified versions of such material be marked in + reasonable ways as different from the original version; or + + d) Limiting the use for publicity purposes of names of licensors or + authors of the material; or + + e) Declining to grant rights under trademark law for use of some + trade names, trademarks, or service marks; or + + f) Requiring indemnification of licensors and authors of that + material by anyone who conveys the material (or modified versions of + it) with contractual assumptions of liability to the recipient, for + any liability that these contractual assumptions directly impose on + those licensors and authors. + + All other non-permissive additional terms are considered "further +restrictions" within the meaning of section 10. 
If the Program as you +received it, or any part of it, contains a notice stating that it is +governed by this License along with a term that is a further +restriction, you may remove that term. If a license document contains +a further restriction but permits relicensing or conveying under this +License, you may add to a covered work material governed by the terms +of that license document, provided that the further restriction does +not survive such relicensing or conveying. + + If you add terms to a covered work in accord with this section, you +must place, in the relevant source files, a statement of the +additional terms that apply to those files, or a notice indicating +where to find the applicable terms. + + Additional terms, permissive or non-permissive, may be stated in the +form of a separately written license, or stated as exceptions; +the above requirements apply either way. + + 8. Termination. + + You may not propagate or modify a covered work except as expressly +provided under this License. Any attempt otherwise to propagate or +modify it is void, and will automatically terminate your rights under +this License (including any patent licenses granted under the third +paragraph of section 11). + + However, if you cease all violation of this License, then your +license from a particular copyright holder is reinstated (a) +provisionally, unless and until the copyright holder explicitly and +finally terminates your license, and (b) permanently, if the copyright +holder fails to notify you of the violation by some reasonable means +prior to 60 days after the cessation. + + Moreover, your license from a particular copyright holder is +reinstated permanently if the copyright holder notifies you of the +violation by some reasonable means, this is the first time you have +received notice of violation of this License (for any work) from that +copyright holder, and you cure the violation prior to 30 days after +your receipt of the notice. 
+ + Termination of your rights under this section does not terminate the +licenses of parties who have received copies or rights from you under +this License. If your rights have been terminated and not permanently +reinstated, you do not qualify to receive new licenses for the same +material under section 10. + + 9. Acceptance Not Required for Having Copies. + + You are not required to accept this License in order to receive or +run a copy of the Program. Ancillary propagation of a covered work +occurring solely as a consequence of using peer-to-peer transmission +to receive a copy likewise does not require acceptance. However, +nothing other than this License grants you permission to propagate or +modify any covered work. These actions infringe copyright if you do +not accept this License. Therefore, by modifying or propagating a +covered work, you indicate your acceptance of this License to do so. + + 10. Automatic Licensing of Downstream Recipients. + + Each time you convey a covered work, the recipient automatically +receives a license from the original licensors, to run, modify and +propagate that work, subject to this License. You are not responsible +for enforcing compliance by third parties with this License. + + An "entity transaction" is a transaction transferring control of an +organization, or substantially all assets of one, or subdividing an +organization, or merging organizations. If propagation of a covered +work results from an entity transaction, each party to that +transaction who receives a copy of the work also receives whatever +licenses to the work the party's predecessor in interest had or could +give under the previous paragraph, plus a right to possession of the +Corresponding Source of the work from the predecessor in interest, if +the predecessor has it or can get it with reasonable efforts. + + You may not impose any further restrictions on the exercise of the +rights granted or affirmed under this License. 
For example, you may +not impose a license fee, royalty, or other charge for exercise of +rights granted under this License, and you may not initiate litigation +(including a cross-claim or counterclaim in a lawsuit) alleging that +any patent claim is infringed by making, using, selling, offering for +sale, or importing the Program or any portion of it. + + 11. Patents. + + A "contributor" is a copyright holder who authorizes use under this +License of the Program or a work on which the Program is based. The +work thus licensed is called the contributor's "contributor version". + + A contributor's "essential patent claims" are all patent claims +owned or controlled by the contributor, whether already acquired or +hereafter acquired, that would be infringed by some manner, permitted +by this License, of making, using, or selling its contributor version, +but do not include claims that would be infringed only as a +consequence of further modification of the contributor version. For +purposes of this definition, "control" includes the right to grant +patent sublicenses in a manner consistent with the requirements of +this License. + + Each contributor grants you a non-exclusive, worldwide, royalty-free +patent license under the contributor's essential patent claims, to +make, use, sell, offer for sale, import and otherwise run, modify and +propagate the contents of its contributor version. + + In the following three paragraphs, a "patent license" is any express +agreement or commitment, however denominated, not to enforce a patent +(such as an express permission to practice a patent or covenant not to +sue for patent infringement). To "grant" such a patent license to a +party means to make such an agreement or commitment not to enforce a +patent against the party. 
+ + If you convey a covered work, knowingly relying on a patent license, +and the Corresponding Source of the work is not available for anyone +to copy, free of charge and under the terms of this License, through a +publicly available network server or other readily accessible means, +then you must either (1) cause the Corresponding Source to be so +available, or (2) arrange to deprive yourself of the benefit of the +patent license for this particular work, or (3) arrange, in a manner +consistent with the requirements of this License, to extend the patent +license to downstream recipients. "Knowingly relying" means you have +actual knowledge that, but for the patent license, your conveying the +covered work in a country, or your recipient's use of the covered work +in a country, would infringe one or more identifiable patents in that +country that you have reason to believe are valid. + + If, pursuant to or in connection with a single transaction or +arrangement, you convey, or propagate by procuring conveyance of, a +covered work, and grant a patent license to some of the parties +receiving the covered work authorizing them to use, propagate, modify +or convey a specific copy of the covered work, then the patent license +you grant is automatically extended to all recipients of the covered +work and works based on it. + + A patent license is "discriminatory" if it does not include within +the scope of its coverage, prohibits the exercise of, or is +conditioned on the non-exercise of one or more of the rights that are +specifically granted under this License. 
You may not convey a covered +work if you are a party to an arrangement with a third party that is +in the business of distributing software, under which you make payment +to the third party based on the extent of your activity of conveying +the work, and under which the third party grants, to any of the +parties who would receive the covered work from you, a discriminatory +patent license (a) in connection with copies of the covered work +conveyed by you (or copies made from those copies), or (b) primarily +for and in connection with specific products or compilations that +contain the covered work, unless you entered into that arrangement, +or that patent license was granted, prior to 28 March 2007. + + Nothing in this License shall be construed as excluding or limiting +any implied license or other defenses to infringement that may +otherwise be available to you under applicable patent law. + + 12. No Surrender of Others' Freedom. + + If conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot convey a +covered work so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you may +not convey it at all. For example, if you agree to terms that obligate you +to collect a royalty for further conveying from those to whom you convey +the Program, the only way you could satisfy both those terms and this +License would be to refrain entirely from conveying the Program. + + 13. Remote Network Interaction; Use with the GNU General Public License. 
+ + Notwithstanding any other provision of this License, if you modify the +Program, your modified version must prominently offer all users +interacting with it remotely through a computer network (if your version +supports such interaction) an opportunity to receive the Corresponding +Source of your version by providing access to the Corresponding Source +from a network server at no charge, through some standard or customary +means of facilitating copying of software. This Corresponding Source +shall include the Corresponding Source for any work covered by version 3 +of the GNU General Public License that is incorporated pursuant to the +following paragraph. + + Notwithstanding any other provision of this License, you have +permission to link or combine any covered work with a work licensed +under version 3 of the GNU General Public License into a single +combined work, and to convey the resulting work. The terms of this +License will continue to apply to the part which is the covered work, +but the work with which it is combined will remain governed by version +3 of the GNU General Public License. + + 14. Revised Versions of this License. + + The Free Software Foundation may publish revised and/or new versions of +the GNU Affero General Public License from time to time. Such new versions +will be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + + Each version is given a distinguishing version number. If the +Program specifies that a certain numbered version of the GNU Affero General +Public License "or any later version" applies to it, you have the +option of following the terms and conditions either of that numbered +version or of any later version published by the Free Software +Foundation. If the Program does not specify a version number of the +GNU Affero General Public License, you may choose any version ever published +by the Free Software Foundation. 
+ + If the Program specifies that a proxy can decide which future +versions of the GNU Affero General Public License can be used, that proxy's +public statement of acceptance of a version permanently authorizes you +to choose that version for the Program. + + Later license versions may give you additional or different +permissions. However, no additional obligations are imposed on any +author or copyright holder as a result of your choosing to follow a +later version. + + 15. Disclaimer of Warranty. + + THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY +APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT +HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY +OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM +IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF +ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. Limitation of Liability. + + IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS +THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE +USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF +DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD +PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), +EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF +SUCH DAMAGES. + + 17. Interpretation of Sections 15 and 16. 
+ + If the disclaimer of warranty and limitation of liability provided +above cannot be given local legal effect according to their terms, +reviewing courts shall apply local law that most closely approximates +an absolute waiver of all civil liability in connection with the +Program, unless a warranty or assumption of liability accompanies a +copy of the Program in return for a fee. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +state the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU Affero General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU Affero General Public License for more details. + + You should have received a copy of the GNU Affero General Public License + along with this program. If not, see . + +Also add information on how to contact you by electronic and paper mail. + + If your software can interact with users remotely through a computer +network, you should also make sure that it provides a way for users to +get its source. For example, if your program is a web application, its +interface could display a "Source" link that leads users to an archive +of the code. 
There are many ways you could offer source, and different +solutions will be better for different programs; see section 13 for the +specific requirements. + + You should also get your employer (if you work as a programmer) or school, +if any, to sign a "copyright disclaimer" for the program, if necessary. +For more information on this, and how to apply and follow the GNU AGPL, see +. diff --git a/README.md b/README.md new file mode 100644 index 0000000..e8a20ed --- /dev/null +++ b/README.md @@ -0,0 +1,71 @@ +## go-fitz + +### Maal Disclaimer +This is a fork of the upstream GitHub repository, made in order to recompile against a newer glibc version. + +If you run into compatibility problems, download the source package from [MuPDF source](https://mupdf.com/downloads/archive/mupdf-1.23.0-source.tar.gz) and run ```make``` to compile it on your platform. When the build completes, copy the following files: + +```libmupdf.a``` ==> ```libs/libmupdf_linux_amd64.a``` +```libmupdf-third.a``` ==> ```libs/libmupdfthird_linux_amd64.a``` + +The 'linux_amd64' part of the file name denotes the OS and platform. +##### Note that you need to compile the MuPDF source for each OS and platform you want to use + +### --------------------------------------------------------------------------------------- + +Go wrapper for [MuPDF](http://mupdf.com/) fitz library that can extract pages from PDF and EPUB documents as images, text, html or svg.
+ +### Build tags + +* `extlib` - use external MuPDF library +* `static` - build with static external MuPDF library (used with `extlib`) +* `pkgconfig` - enable pkg-config (used with `extlib`) +* `musl` - use musl compiled library + +### Example +```go +package main + +import ( + "fmt" + "image/jpeg" + "os" + "path/filepath" + + "github.com/gen2brain/go-fitz" +) + +func main() { + doc, err := fitz.New("test.pdf") + if err != nil { + panic(err) + } + + defer doc.Close() + + tmpDir, err := os.MkdirTemp(os.TempDir(), "fitz") + if err != nil { + panic(err) + } + + // Extract pages as images + for n := 0; n < doc.NumPage(); n++ { + img, err := doc.Image(n) + if err != nil { + panic(err) + } + + f, err := os.Create(filepath.Join(tmpDir, fmt.Sprintf("test%03d.jpg", n))) + if err != nil { + panic(err) + } + + err = jpeg.Encode(f, img, &jpeg.Options{jpeg.DefaultQuality}) + if err != nil { + panic(err) + } + + f.Close() + } +} +``` diff --git a/example_test.go b/example_test.go new file mode 100644 index 0000000..d77024b --- /dev/null +++ b/example_test.go @@ -0,0 +1,104 @@ +package fitz_test + +import ( + "fmt" + "image/jpeg" + "os" + "path/filepath" + + "git.ma-al.com/go-fitz" +) + +func ExampleNew() { + doc, err := fitz.New("test.pdf") + if err != nil { + panic(err) + } + + defer doc.Close() + + tmpDir, err := os.MkdirTemp(os.TempDir(), "fitz") + if err != nil { + panic(err) + } + + // Extract pages as images + for n := 0; n < doc.NumPage(); n++ { + img, err := doc.Image(n) + if err != nil { + panic(err) + } + + f, err := os.Create(filepath.Join(tmpDir, fmt.Sprintf("test%03d.jpg", n))) + if err != nil { + panic(err) + } + + err = jpeg.Encode(f, img, &jpeg.Options{Quality: jpeg.DefaultQuality}) + if err != nil { + panic(err) + } + + f.Close() + } + + // Extract pages as text + for n := 0; n < doc.NumPage(); n++ { + text, err := doc.Text(n) + if err != nil { + panic(err) + } + + f, err := os.Create(filepath.Join(tmpDir, fmt.Sprintf("test%03d.txt", n))) + if err != nil { + 
panic(err) + } + + _, err = f.WriteString(text) + if err != nil { + panic(err) + } + + f.Close() + } + + // Extract pages as html + for n := 0; n < doc.NumPage(); n++ { + html, err := doc.HTML(n, true) + if err != nil { + panic(err) + } + + f, err := os.Create(filepath.Join(tmpDir, fmt.Sprintf("test%03d.html", n))) + if err != nil { + panic(err) + } + + _, err = f.WriteString(html) + if err != nil { + panic(err) + } + + f.Close() + } + + // Extract pages as svg + for n := 0; n < doc.NumPage(); n++ { + svg, err := doc.SVG(n) + if err != nil { + panic(err) + } + + f, err := os.Create(filepath.Join(tmpDir, fmt.Sprintf("test%03d.svg", n))) + if err != nil { + panic(err) + } + + _, err = f.WriteString(svg) + if err != nil { + panic(err) + } + + f.Close() + } +} diff --git a/fitz.go b/fitz.go new file mode 100644 index 0000000..a1140e4 --- /dev/null +++ b/fitz.go @@ -0,0 +1,559 @@ +// Package fitz provides wrapper for the [MuPDF](http://mupdf.com/) fitz library +// that can extract pages from PDF and EPUB documents as images, text, html or svg. +package fitz + +/* +#include +#include + +const char *fz_version = FZ_VERSION; + +fz_document *open_document(fz_context *ctx, const char *filename) { + fz_document *doc; + + fz_try(ctx) { + doc = fz_open_document(ctx, filename); + } + fz_catch(ctx) { + return NULL; + } + + return doc; +} + +fz_document *open_document_with_stream(fz_context *ctx, const char *magic, fz_stream *stream) { + fz_document *doc; + + fz_try(ctx) { + doc = fz_open_document_with_stream(ctx, magic, stream); + } + fz_catch(ctx) { + return NULL; + } + + return doc; +} +*/ +import "C" + +import ( + "errors" + "image" + "io" + "os" + "path/filepath" + "sync" + "unsafe" +) + +// Errors. 
+var ( + ErrNoSuchFile = errors.New("fitz: no such file") + ErrCreateContext = errors.New("fitz: cannot create context") + ErrOpenDocument = errors.New("fitz: cannot open document") + ErrOpenMemory = errors.New("fitz: cannot open memory") + ErrPageMissing = errors.New("fitz: page missing") + ErrCreatePixmap = errors.New("fitz: cannot create pixmap") + ErrPixmapSamples = errors.New("fitz: cannot get pixmap samples") + ErrNeedsPassword = errors.New("fitz: document needs password") + ErrLoadOutline = errors.New("fitz: cannot load outline") +) + +// Document represents fitz document. +type Document struct { + ctx *C.struct_fz_context + data []byte // binds data to the Document lifecycle avoiding premature GC + doc *C.struct_fz_document + mtx sync.Mutex + stream *C.fz_stream +} + +// Outline type. +type Outline struct { + // Hierarchy level of the entry (starting from 1). + Level int + // Title of outline item. + Title string + // Destination in the document to be displayed when this outline item is activated. + URI string + // The page number of an internal link. + Page int + // Top. + Top float64 +} + +// Link type. +type Link struct { + URI string +} + +// New returns new fitz document. 
+func New(filename string) (f *Document, err error) { + f = &Document{} + + filename, err = filepath.Abs(filename) + if err != nil { + return + } + + if _, e := os.Stat(filename); e != nil { + err = ErrNoSuchFile + return + } + + f.ctx = (*C.struct_fz_context)(unsafe.Pointer(C.fz_new_context_imp(nil, nil, C.FZ_STORE_UNLIMITED, C.fz_version))) + if f.ctx == nil { + err = ErrCreateContext + return + } + + C.fz_register_document_handlers(f.ctx) + + cfilename := C.CString(filename) + defer C.free(unsafe.Pointer(cfilename)) + + f.doc = C.open_document(f.ctx, cfilename) + if f.doc == nil { + err = ErrOpenDocument + return + } + + ret := C.fz_needs_password(f.ctx, f.doc) + v := int(ret) != 0 + if v { + err = ErrNeedsPassword + } + + return +} + +// NewFromMemory returns new fitz document from byte slice. +func NewFromMemory(b []byte) (f *Document, err error) { + f = &Document{} + + f.ctx = (*C.struct_fz_context)(unsafe.Pointer(C.fz_new_context_imp(nil, nil, C.FZ_STORE_UNLIMITED, C.fz_version))) + if f.ctx == nil { + err = ErrCreateContext + return + } + + C.fz_register_document_handlers(f.ctx) + + magic := contentType(b) + if magic == "" { + err = ErrOpenMemory + return + } + + stream := C.fz_open_memory(f.ctx, (*C.uchar)(&b[0]), C.size_t(len(b))) + f.stream = C.fz_keep_stream(f.ctx, stream) + + if f.stream == nil { + err = ErrOpenMemory + return + } + + f.data = b + + cmagic := C.CString(magic) + defer C.free(unsafe.Pointer(cmagic)) + + f.doc = C.open_document_with_stream(f.ctx, cmagic, f.stream) + if f.doc == nil { + err = ErrOpenDocument + return + } + + ret := C.fz_needs_password(f.ctx, f.doc) + if int(ret) != 0 { + err = ErrNeedsPassword + } + + return +} + +// NewFromReader returns new fitz document from io.Reader. +func NewFromReader(r io.Reader) (f *Document, err error) { + b, e := io.ReadAll(r) + if e != nil { + err = e + return + } + + f, err = NewFromMemory(b) + + return +} + +// NumPage returns total number of pages in document.
+func (f *Document) NumPage() int { + return int(C.fz_count_pages(f.ctx, f.doc)) +} + +// Image returns image for given page number. +func (f *Document) Image(pageNumber int) (image.Image, error) { + return f.ImageDPI(pageNumber, 300.0) +} + +// ImageDPI returns image for given page number and DPI. +func (f *Document) ImageDPI(pageNumber int, dpi float64) (image.Image, error) { + f.mtx.Lock() + defer f.mtx.Unlock() + + img := image.RGBA{} + + if pageNumber >= f.NumPage() { + return nil, ErrPageMissing + } + + page := C.fz_load_page(f.ctx, f.doc, C.int(pageNumber)) + defer C.fz_drop_page(f.ctx, page) + + var bounds C.fz_rect + bounds = C.fz_bound_page(f.ctx, page) + + var ctm C.fz_matrix + ctm = C.fz_scale(C.float(dpi/72), C.float(dpi/72)) + + var bbox C.fz_irect + bounds = C.fz_transform_rect(bounds, ctm) + bbox = C.fz_round_rect(bounds) + + pixmap := C.fz_new_pixmap_with_bbox(f.ctx, C.fz_device_rgb(f.ctx), bbox, nil, 1) + if pixmap == nil { + return nil, ErrCreatePixmap + } + + C.fz_clear_pixmap_with_value(f.ctx, pixmap, C.int(0xff)) + //defer C.fz_drop_pixmap(f.ctx, pixmap) + + device := C.fz_new_draw_device(f.ctx, ctm, pixmap) + C.fz_enable_device_hints(f.ctx, device, C.FZ_NO_CACHE) + defer C.fz_drop_device(f.ctx, device) + + drawMatrix := C.fz_identity + C.fz_run_page(f.ctx, page, device, drawMatrix, nil) + + C.fz_close_device(f.ctx, device) + + pixels := C.fz_pixmap_samples(f.ctx, pixmap) + if pixels == nil { + return nil, ErrPixmapSamples + } + defer C.free(unsafe.Pointer(pixels)) + + img.Pix = C.GoBytes(unsafe.Pointer(pixels), C.int(4*bbox.x1*bbox.y1)) + img.Rect = image.Rect(int(bbox.x0), int(bbox.y0), int(bbox.x1), int(bbox.y1)) + img.Stride = 4 * img.Rect.Max.X + + return &img, nil +} + +// ImagePNG returns image for given page number as PNG bytes. 
+func (f *Document) ImagePNG(pageNumber int, dpi float64) ([]byte, error) { + f.mtx.Lock() + defer f.mtx.Unlock() + + if pageNumber >= f.NumPage() { + return nil, ErrPageMissing + } + + page := C.fz_load_page(f.ctx, f.doc, C.int(pageNumber)) + defer C.fz_drop_page(f.ctx, page) + + var bounds C.fz_rect + bounds = C.fz_bound_page(f.ctx, page) + + var ctm C.fz_matrix + ctm = C.fz_scale(C.float(dpi/72), C.float(dpi/72)) + + var bbox C.fz_irect + bounds = C.fz_transform_rect(bounds, ctm) + bbox = C.fz_round_rect(bounds) + + pixmap := C.fz_new_pixmap_with_bbox(f.ctx, C.fz_device_rgb(f.ctx), bbox, nil, 1) + if pixmap == nil { + return nil, ErrCreatePixmap + } + + C.fz_clear_pixmap_with_value(f.ctx, pixmap, C.int(0xff)) + //defer C.fz_drop_pixmap(f.ctx, pixmap) + + device := C.fz_new_draw_device(f.ctx, ctm, pixmap) + C.fz_enable_device_hints(f.ctx, device, C.FZ_NO_CACHE) + defer C.fz_drop_device(f.ctx, device) + + drawMatrix := C.fz_identity + C.fz_run_page(f.ctx, page, device, drawMatrix, nil) + + C.fz_close_device(f.ctx, device) + + buf := C.fz_new_buffer_from_pixmap_as_png(f.ctx, pixmap, C.fz_default_color_params) + defer C.fz_drop_buffer(f.ctx, buf) + + size := C.fz_buffer_storage(f.ctx, buf, nil) + str := C.GoStringN(C.fz_string_from_buffer(f.ctx, buf), C.int(size)) + + return []byte(str), nil +} + +// Links returns slice of links for given page number. 
+func (f *Document) Links(pageNumber int) ([]Link, error) { + f.mtx.Lock() + defer f.mtx.Unlock() + + if pageNumber >= f.NumPage() { + return nil, ErrPageMissing + } + + page := C.fz_load_page(f.ctx, f.doc, C.int(pageNumber)) + defer C.fz_drop_page(f.ctx, page) + + links := C.fz_load_links(f.ctx, page) + defer C.fz_drop_link(f.ctx, links) + + linkCount := 0 + for currLink := links; currLink != nil; currLink = currLink.next { + linkCount++ + } + + if linkCount == 0 { + return nil, nil + } + + gLinks := make([]Link, linkCount) + + currLink := links + for i := 0; i < linkCount; i++ { + gLinks[i] = Link{ + URI: C.GoString(currLink.uri), + } + currLink = currLink.next + } + + return gLinks, nil +} + +// Text returns text for given page number. +func (f *Document) Text(pageNumber int) (string, error) { + f.mtx.Lock() + defer f.mtx.Unlock() + + if pageNumber >= f.NumPage() { + return "", ErrPageMissing + } + + page := C.fz_load_page(f.ctx, f.doc, C.int(pageNumber)) + defer C.fz_drop_page(f.ctx, page) + + var bounds C.fz_rect + bounds = C.fz_bound_page(f.ctx, page) + + var ctm C.fz_matrix + ctm = C.fz_scale(C.float(72.0/72), C.float(72.0/72)) + + text := C.fz_new_stext_page(f.ctx, bounds) + defer C.fz_drop_stext_page(f.ctx, text) + + var opts C.fz_stext_options + opts.flags = 0 + + device := C.fz_new_stext_device(f.ctx, text, &opts) + C.fz_enable_device_hints(f.ctx, device, C.FZ_NO_CACHE) + defer C.fz_drop_device(f.ctx, device) + + var cookie C.fz_cookie + C.fz_run_page(f.ctx, page, device, ctm, &cookie) + + C.fz_close_device(f.ctx, device) + + buf := C.fz_new_buffer_from_stext_page(f.ctx, text) + defer C.fz_drop_buffer(f.ctx, buf) + + str := C.GoString(C.fz_string_from_buffer(f.ctx, buf)) + + return str, nil +} + +// HTML returns html for given page number. 
+func (f *Document) HTML(pageNumber int, header bool) (string, error) { + f.mtx.Lock() + defer f.mtx.Unlock() + + if pageNumber >= f.NumPage() { + return "", ErrPageMissing + } + + page := C.fz_load_page(f.ctx, f.doc, C.int(pageNumber)) + defer C.fz_drop_page(f.ctx, page) + + var bounds C.fz_rect + bounds = C.fz_bound_page(f.ctx, page) + + var ctm C.fz_matrix + ctm = C.fz_scale(C.float(72.0/72), C.float(72.0/72)) + + text := C.fz_new_stext_page(f.ctx, bounds) + defer C.fz_drop_stext_page(f.ctx, text) + + var opts C.fz_stext_options + opts.flags = C.FZ_STEXT_PRESERVE_IMAGES + + device := C.fz_new_stext_device(f.ctx, text, &opts) + C.fz_enable_device_hints(f.ctx, device, C.FZ_NO_CACHE) + defer C.fz_drop_device(f.ctx, device) + + var cookie C.fz_cookie + C.fz_run_page(f.ctx, page, device, ctm, &cookie) + + C.fz_close_device(f.ctx, device) + + buf := C.fz_new_buffer(f.ctx, 1024) + defer C.fz_drop_buffer(f.ctx, buf) + + out := C.fz_new_output_with_buffer(f.ctx, buf) + defer C.fz_drop_output(f.ctx, out) + + if header { + C.fz_print_stext_header_as_html(f.ctx, out) + } + C.fz_print_stext_page_as_html(f.ctx, out, text, C.int(pageNumber)) + if header { + C.fz_print_stext_trailer_as_html(f.ctx, out) + } + + str := C.GoString(C.fz_string_from_buffer(f.ctx, buf)) + + return str, nil +} + +// SVG returns svg document for given page number. 
+func (f *Document) SVG(pageNumber int) (string, error) { + f.mtx.Lock() + defer f.mtx.Unlock() + + if pageNumber >= f.NumPage() { + return "", ErrPageMissing + } + + page := C.fz_load_page(f.ctx, f.doc, C.int(pageNumber)) + defer C.fz_drop_page(f.ctx, page) + + var bounds C.fz_rect + bounds = C.fz_bound_page(f.ctx, page) + + var ctm C.fz_matrix + ctm = C.fz_scale(C.float(72.0/72), C.float(72.0/72)) + bounds = C.fz_transform_rect(bounds, ctm) + + buf := C.fz_new_buffer(f.ctx, 1024) + defer C.fz_drop_buffer(f.ctx, buf) + + out := C.fz_new_output_with_buffer(f.ctx, buf) + defer C.fz_drop_output(f.ctx, out) + + device := C.fz_new_svg_device(f.ctx, out, bounds.x1-bounds.x0, bounds.y1-bounds.y0, C.FZ_SVG_TEXT_AS_PATH, 1) + C.fz_enable_device_hints(f.ctx, device, C.FZ_NO_CACHE) + defer C.fz_drop_device(f.ctx, device) + + var cookie C.fz_cookie + C.fz_run_page(f.ctx, page, device, ctm, &cookie) + + C.fz_close_device(f.ctx, device) + + str := C.GoString(C.fz_string_from_buffer(f.ctx, buf)) + + return str, nil +} + +// ToC returns the table of contents (also known as outline). +func (f *Document) ToC() ([]Outline, error) { + data := make([]Outline, 0) + + outline := C.fz_load_outline(f.ctx, f.doc) + if outline == nil { + return nil, ErrLoadOutline + } + defer C.fz_drop_outline(f.ctx, outline) + + var walk func(outline *C.fz_outline, level int) + + walk = func(outline *C.fz_outline, level int) { + for outline != nil { + res := Outline{} + res.Level = level + res.Title = C.GoString(outline.title) + res.URI = C.GoString(outline.uri) + res.Page = int(outline.page.page) + res.Top = float64(outline.y) + data = append(data, res) + + if outline.down != nil { + walk(outline.down, level+1) + } + outline = outline.next + } + } + + walk(outline, 1) + return data, nil +} + +// Metadata returns the map with standard metadata. 
+func (f *Document) Metadata() map[string]string { + data := make(map[string]string) + + lookup := func(key string) string { + ckey := C.CString(key) + defer C.free(unsafe.Pointer(ckey)) + + buf := make([]byte, 256) + C.fz_lookup_metadata(f.ctx, f.doc, ckey, (*C.char)(unsafe.Pointer(&buf[0])), C.int(len(buf))) + + return string(buf) + } + + data["format"] = lookup("format") + data["encryption"] = lookup("encryption") + data["title"] = lookup("info:Title") + data["author"] = lookup("info:Author") + data["subject"] = lookup("info:Subject") + data["keywords"] = lookup("info:Keywords") + data["creator"] = lookup("info:Creator") + data["producer"] = lookup("info:Producer") + data["creationDate"] = lookup("info:CreationDate") + data["modDate"] = lookup("info:ModDate") + + return data +} + +// Bound gives the Bounds of a given Page in the document. +func (f *Document) Bound(pageNumber int) (image.Rectangle, error) { + f.mtx.Lock() + defer f.mtx.Unlock() + + if pageNumber >= f.NumPage() { + return image.Rectangle{}, ErrPageMissing + } + + page := C.fz_load_page(f.ctx, f.doc, C.int(pageNumber)) + defer C.fz_drop_page(f.ctx, page) + + var bounds C.fz_rect + bounds = C.fz_bound_page(f.ctx, page) + return image.Rect(int(bounds.x0), int(bounds.y0), int(bounds.x1), int(bounds.y1)), nil +} + +// Close closes the underlying fitz document.
+func (f *Document) Close() error { + if f.stream != nil { + C.fz_drop_stream(f.ctx, f.stream) + } + + C.fz_drop_document(f.ctx, f.doc) + C.fz_drop_context(f.ctx) + + f.data = nil + + return nil +} diff --git a/fitz_cgo.go b/fitz_cgo.go new file mode 100644 index 0000000..18d7ade --- /dev/null +++ b/fitz_cgo.go @@ -0,0 +1,21 @@ +//go:build !extlib + +package fitz + +/* +#cgo CFLAGS: -Iinclude + +#cgo linux,386 LDFLAGS: -L${SRCDIR}/libs -lmupdf_linux_386 -lmupdfthird_linux_386 -lm +#cgo linux,amd64,!musl LDFLAGS: -L${SRCDIR}/libs -lmupdf_linux_amd64 -lmupdfthird_linux_amd64 -lm +#cgo linux,amd64,musl LDFLAGS: -L${SRCDIR}/libs -lmupdf_linux_amd64_musl -lmupdfthird_linux_amd64_musl -lm +#cgo linux,!android,arm LDFLAGS: -L${SRCDIR}/libs -lmupdf_linux_arm -lmupdfthird_linux_arm -lm +#cgo linux,!android,arm64,!musl LDFLAGS: -L${SRCDIR}/libs -lmupdf_linux_arm64 -lmupdfthird_linux_arm64 -lm +#cgo linux,!android,arm64,musl LDFLAGS: -L${SRCDIR}/libs -lmupdf_linux_arm64_musl -lmupdfthird_linux_arm64_musl -lm +#cgo android,arm LDFLAGS: -L${SRCDIR}/libs -lmupdf_android_arm -lmupdfthird_android_arm -lm -llog +#cgo android,arm64 LDFLAGS: -L${SRCDIR}/libs -lmupdf_android_arm64 -lmupdfthird_android_arm64 -lm -llog +#cgo windows,386 LDFLAGS: -L${SRCDIR}/libs -lmupdf_windows_386 -lmupdfthird_windows_386 -lm -lcomdlg32 -lgdi32 -lmsvcr90 -Wl,--allow-multiple-definition +#cgo windows,amd64 LDFLAGS: -L${SRCDIR}/libs -lmupdf_windows_amd64 -lmupdfthird_windows_amd64 -lm -lcomdlg32 -lgdi32 -Wl,--allow-multiple-definition +#cgo darwin,amd64 LDFLAGS: -L${SRCDIR}/libs -lmupdf_darwin_amd64 -lmupdfthird_darwin_amd64 -lm +#cgo darwin,arm64 LDFLAGS: -L${SRCDIR}/libs -lmupdf_darwin_arm64 -lmupdfthird_darwin_arm64 -lm +*/ +import "C" diff --git a/fitz_cgo_extlib.go b/fitz_cgo_extlib.go new file mode 100644 index 0000000..c7d5f7e --- /dev/null +++ b/fitz_cgo_extlib.go @@ -0,0 +1,11 @@ +//go:build extlib && !pkgconfig + +package fitz + +/* +#cgo !static LDFLAGS: -lmupdf -lm +#cgo static LDFLAGS: 
-lmupdf -lm -lmupdf-third +#cgo android LDFLAGS: -llog +#cgo windows LDFLAGS: -lcomdlg32 -lgdi32 +*/ +import "C" diff --git a/fitz_cgo_extlib_pkgconfig.go b/fitz_cgo_extlib_pkgconfig.go new file mode 100644 index 0000000..ecf5145 --- /dev/null +++ b/fitz_cgo_extlib_pkgconfig.go @@ -0,0 +1,8 @@ +//go:build extlib && pkgconfig + +package fitz + +/* +#cgo pkg-config: mupdf +*/ +import "C" diff --git a/fitz_content_types.go b/fitz_content_types.go new file mode 100644 index 0000000..a4f54fc --- /dev/null +++ b/fitz_content_types.go @@ -0,0 +1,171 @@ +package fitz + +// contentType returns document MIME type. +func contentType(b []byte) string { + l := len(b) + // for file length shortcuts see https://github.com/mathiasbynens/small + switch { + case l < 8: + return "" + case isPAM(b): + return "image/x-portable-arbitrarymap" + case isPBM(b): + return "image/x-portable-bitmap" + case isPFM(b): + return "image/x-portable-floatmap" + case isPGM(b): + return "image/x-portable-greymap" + case isPPM(b): + return "image/x-portable-pixmap" + case isGIF(b): + return "image/gif" + case l < 16: + return "" + case isBMP(b): + return "image/bmp" + case isJBIG2(b): + // file header + segment header = 24 bytes + return "image/x-jb2" + case l < 32: + return "" + case isTIFF(b): + return "image/tiff" + case l < 64: + return "" + case isJPEG(b): + return "image/jpeg" + case isPNG(b): + return "image/png" + case isJPEG2000(b): + return "image/jp2" + case isJPEGXR(b): + return "image/vnd.ms-photo" + case isPDF(b): + return "application/pdf" + case isZIP(b): + switch { + case isEPUB(b): + return "application/epub+zip" + case isXPS(b): + return "application/oxps" + default: + // fitz will consider it a Comic Book Archive + // must contain at least one image, i.e. 
>64 bytes + return "application/zip" + } + case isXML(b): + // fitz will consider it an FB2 + // minimal valid FB2 w/o content is >64 bytes + return "text/xml" + default: + return "" + } +} + +func isBMP(b []byte) bool { + return b[0] == 0x42 && b[1] == 0x4D +} + +func isGIF(b []byte) bool { + return b[0] == 0x47 && b[1] == 0x49 && b[2] == 0x46 && b[3] == 0x38 +} + +func isJBIG2(b []byte) bool { + return b[0] == 0x97 && b[1] == 0x4A && b[2] == 0x42 && b[3] == 0x32 && + b[4] == 0x0D && b[5] == 0x0A && b[6] == 0x1A && b[7] == 0x0A +} + +func isJPEG(b []byte) bool { + return b[0] == 0xFF && b[1] == 0xD8 && b[2] == 0xFF +} + +func isJPEG2000(b []byte) bool { + switch { + case b[0] == 0xFF && b[1] == 0x4F && b[2] == 0xFF && b[3] == 0x51: + return true + default: + return b[0] == 0x00 && b[1] == 0x00 && b[2] == 0x00 && b[3] == 0x0C && + b[4] == 0x6A && b[5] == 0x50 && b[6] == 0x20 && b[7] == 0x20 && + b[8] == 0x0D && b[9] == 0x0A && b[10] == 0x87 && b[11] == 0x0A + } +} + +func isJPEGXR(b []byte) bool { + return b[0] == 0x49 && b[1] == 0x49 && b[2] == 0xBC +} + +func isPAM(b []byte) bool { + return b[0] == 0x50 && b[1] == 0x37 && b[2] == 0x0A +} + +func isPBM(b []byte) bool { + return b[0] == 0x50 && (b[1] == 0x31 || b[1] == 0x34) && b[2] == 0x0A +} + +func isPFM(b []byte) bool { + return b[0] == 0x50 && (b[1] == 0x46 || b[1] == 0x66) && b[2] == 0x0A +} + +func isPGM(b []byte) bool { + return b[0] == 0x50 && (b[1] == 0x32 || b[1] == 0x35) && b[2] == 0x0A +} + +func isPPM(b []byte) bool { + return b[0] == 0x50 && (b[1] == 0x33 || b[1] == 0x36) && b[2] == 0x0A +} + +func isPNG(b []byte) bool { + return b[0] == 0x89 && b[1] == 0x50 && b[2] == 0x4E && b[3] == 0x47 && + b[4] == 0x0D && b[5] == 0x0A && b[6] == 0x1A && b[7] == 0x0A +} + +func isTIFF(b []byte) bool { + return b[0] == 0x49 && b[1] == 0x49 && b[2] == 0x2A && b[3] == 0x00 || + b[0] == 0x4D && b[1] == 0x4D && b[2] == 0x00 && b[3] == 0x2A +} + +// PDF magic number 25 50 44 46 = "%PDF". 
+func isPDF(b []byte) bool { + return b[0] == 0x25 && b[1] == 0x50 && b[2] == 0x44 && b[3] == 0x46 +} + +// Non-empty ZIP archive magic number 50 4B 03 04. +func isZIP(b []byte) bool { + return b[0] == 0x50 && b[1] == 0x4B && b[2] == 0x03 && b[3] == 0x04 +} + +// Looks for a file named "mimetype" containing the ASCII string "application/epub+zip". +// The file must be uncompressed and be the first file within the archive. +func isEPUB(b []byte) bool { + return b[30] == 0x6D && b[31] == 0x69 && b[32] == 0x6D && b[33] == 0x65 && + b[34] == 0x74 && b[35] == 0x79 && b[36] == 0x70 && b[37] == 0x65 && + b[38] == 0x61 && b[39] == 0x70 && b[40] == 0x70 && b[41] == 0x6C && + b[42] == 0x69 && b[43] == 0x63 && b[44] == 0x61 && b[45] == 0x74 && + b[46] == 0x69 && b[47] == 0x6F && b[48] == 0x6E && b[49] == 0x2F && + b[50] == 0x65 && b[51] == 0x70 && b[52] == 0x75 && b[53] == 0x62 && + b[54] == 0x2B && b[55] == 0x7A && b[56] == 0x69 && b[57] == 0x70 +} + +// Looks for a file named "[Content_Types].xml" at the root of a ZIP archive. +// MS Office apps put this file first within the archive enabling for fast detection. +func isXPS(b []byte) bool { + return b[30] == 0x5B && b[31] == 0x43 && b[32] == 0x6F && b[33] == 0x6E && + b[34] == 0x74 && b[35] == 0x65 && b[36] == 0x6E && b[37] == 0x74 && + b[38] == 0x5F && b[39] == 0x54 && b[40] == 0x79 && b[41] == 0x70 && + b[42] == 0x65 && b[43] == 0x73 && b[44] == 0x5D && b[45] == 0x2E && + b[46] == 0x78 && b[47] == 0x6D && b[48] == 0x6C +} + +// Checks for " +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUDPF_FITZ_H +#define MUDPF_FITZ_H + +#ifdef __cplusplus +extern "C" { +#endif + +#include "mupdf/fitz/version.h" +#include "mupdf/fitz/config.h" +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/output.h" +#include "mupdf/fitz/log.h" + +#include "mupdf/fitz/crypt.h" +#include "mupdf/fitz/getopt.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/hash.h" +#include "mupdf/fitz/pool.h" +#include "mupdf/fitz/string-util.h" +#include "mupdf/fitz/tree.h" +#include "mupdf/fitz/bidi.h" +#include "mupdf/fitz/xml.h" + +/* I/O */ +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/stream.h" +#include "mupdf/fitz/compress.h" +#include "mupdf/fitz/compressed-buffer.h" +#include "mupdf/fitz/filter.h" +#include "mupdf/fitz/archive.h" + +/* Resources */ +#include "mupdf/fitz/store.h" +#include "mupdf/fitz/color.h" +#include "mupdf/fitz/pixmap.h" +#include "mupdf/fitz/bitmap.h" +#include "mupdf/fitz/image.h" +#include "mupdf/fitz/shade.h" +#include "mupdf/fitz/font.h" +#include "mupdf/fitz/path.h" +#include "mupdf/fitz/text.h" +#include "mupdf/fitz/separation.h" +#include "mupdf/fitz/glyph.h" + +#include "mupdf/fitz/device.h" +#include "mupdf/fitz/display-list.h" +#include "mupdf/fitz/structured-text.h" + +#include "mupdf/fitz/transition.h" +#include "mupdf/fitz/glyph-cache.h" + +/* Document */ +#include "mupdf/fitz/link.h" +#include "mupdf/fitz/outline.h" +#include "mupdf/fitz/document.h" + +#include "mupdf/fitz/util.h" + +/* Output formats */ +#include "mupdf/fitz/writer.h" +#include "mupdf/fitz/band-writer.h" +#include "mupdf/fitz/write-pixmap.h" +#include "mupdf/fitz/output-svg.h" + +#include "mupdf/fitz/story.h" +#include "mupdf/fitz/story-writer.h" + +#ifdef __cplusplus +} +#endif + +#endif diff --git a/include/mupdf/fitz/archive.h b/include/mupdf/fitz/archive.h new file mode 100644 index 0000000..19571cd --- /dev/null +++ b/include/mupdf/fitz/archive.h @@ -0,0 +1,373 @@ +// Copyright (C) 2004-2022 Artifex Software, 
Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_ARCHIVE_H +#define MUPDF_FITZ_ARCHIVE_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/stream.h" +#include "mupdf/fitz/tree.h" + +/* PUBLIC API */ + +/** + fz_archive: + + fz_archive provides methods for accessing "archive" files. + An archive file is a conceptual entity that contains multiple + files, which can be counted, enumerated, and read. + + Implementations of fz_archive based upon directories, zip + and tar files are included. +*/ + +typedef struct fz_archive fz_archive; + +/** + Open a zip or tar archive + + Open a file and identify its archive type based on the archive + signature contained inside. + + filename: a path to a file as it would be given to open(2). +*/ +fz_archive *fz_open_archive(fz_context *ctx, const char *filename); + +/** + Open zip or tar archive stream. + + Open an archive using a seekable stream object rather than + opening a file or directory on disk. 
+*/ +fz_archive *fz_open_archive_with_stream(fz_context *ctx, fz_stream *file); + +/** + Open zip or tar archive stream. + + Does the same as fz_open_archive_with_stream, but will not throw + an error in the event of failing to recognise the format. Will + still throw errors in other cases though! +*/ +fz_archive *fz_try_open_archive_with_stream(fz_context *ctx, fz_stream *file); + +/** + Open a directory as if it was an archive. + + A special case where a directory is opened as if it was an + archive. + + Note that for directories it is not possible to retrieve the + number of entries or list the entries. It is however possible + to check if the archive has a particular entry. + + path: a path to a directory as it would be given to opendir(3). +*/ +fz_archive *fz_open_directory(fz_context *ctx, const char *path); + + +/** + Determine if a given path is a directory. +*/ +int fz_is_directory(fz_context *ctx, const char *path); + +/** + Drop a reference to an archive. + + When the last reference is dropped, this closes and releases + any memory or filehandles associated with the archive. +*/ +void fz_drop_archive(fz_context *ctx, fz_archive *arch); + +/** + Keep a reference to an archive. +*/ +fz_archive * +fz_keep_archive(fz_context *ctx, fz_archive *arch); + +/** + Return a pointer to a string describing the format of the + archive. + + The lifetime of the string is unspecified (in current + implementations the string will persist until the archive + is closed, but this is not guaranteed). +*/ +const char *fz_archive_format(fz_context *ctx, fz_archive *arch); + +/** + Number of entries in archive. + + Will always return a value >= 0. + + May throw an exception if this type of archive cannot count the + entries (such as a directory). +*/ +int fz_count_archive_entries(fz_context *ctx, fz_archive *arch); + +/** + Get listed name of entry position idx. + + idx: Must be a value >= 0 < return value from + fz_count_archive_entries. If not in range NULL will be + returned. 
+ + May throw an exception if this type of archive cannot list the + entries (such as a directory). +*/ +const char *fz_list_archive_entry(fz_context *ctx, fz_archive *arch, int idx); + +/** + Check if entry by given name exists. + + If named entry does not exist 0 will be returned, if it does + exist 1 is returned. + + name: Entry name to look for, this must be an exact match to + the entry name in the archive. +*/ +int fz_has_archive_entry(fz_context *ctx, fz_archive *arch, const char *name); + +/** + Opens an archive entry as a stream. + + name: Entry name to look for, this must be an exact match to + the entry name in the archive. + + Throws an exception if a matching entry cannot be found. +*/ +fz_stream *fz_open_archive_entry(fz_context *ctx, fz_archive *arch, const char *name); + +/** + Opens an archive entry as a stream. + + Returns NULL if a matching entry cannot be found, otherwise + behaves exactly as fz_open_archive_entry. +*/ +fz_stream *fz_try_open_archive_entry(fz_context *ctx, fz_archive *arch, const char *name); + +/** + Reads all bytes in an archive entry + into a buffer. + + name: Entry name to look for, this must be an exact match to + the entry name in the archive. + + Throws an exception if a matching entry cannot be found. +*/ +fz_buffer *fz_read_archive_entry(fz_context *ctx, fz_archive *arch, const char *name); + +/** + Reads all bytes in an archive entry + into a buffer. + + name: Entry name to look for, this must be an exact match to + the entry name in the archive. + + Returns NULL if a matching entry cannot be found. Otherwise behaves + the same as fz_read_archive_entry. Exceptions may be thrown. +*/ +fz_buffer *fz_try_read_archive_entry(fz_context *ctx, fz_archive *arch, const char *name); + +/** + fz_archive: tar implementation +*/ + +/** + Detect if stream object is a tar archive. + + Assumes that the stream object is seekable. +*/ +int fz_is_tar_archive(fz_context *ctx, fz_stream *file); + +/** + Open a tar archive file.
+ + An exception is thrown if the file is not a tar archive as + indicated by the presence of a tar signature. + + filename: a path to a tar archive file as it would be given to + open(2). +*/ +fz_archive *fz_open_tar_archive(fz_context *ctx, const char *filename); + +/** + Open a tar archive stream. + + Open an archive using a seekable stream object rather than + opening a file or directory on disk. + + An exception is thrown if the stream is not a tar archive as + indicated by the presence of a tar signature. + +*/ +fz_archive *fz_open_tar_archive_with_stream(fz_context *ctx, fz_stream *file); + +/** + fz_archive: zip implementation +*/ + +/** + Detect if stream object is a zip archive. + + Assumes that the stream object is seekable. +*/ +int fz_is_zip_archive(fz_context *ctx, fz_stream *file); + +/** + Open a zip archive file. + + An exception is thrown if the file is not a zip archive as + indicated by the presence of a zip signature. + + path: a path to a zip archive file as it would be given to + open(2). +*/ +fz_archive *fz_open_zip_archive(fz_context *ctx, const char *path); + +/** + Open a zip archive stream. + + Open an archive using a seekable stream object rather than + opening a file or directory on disk. + + An exception is thrown if the stream is not a zip archive as + indicated by the presence of a zip signature. + +*/ +fz_archive *fz_open_zip_archive_with_stream(fz_context *ctx, fz_stream *file); + +/** + fz_zip_writer offers methods for creating and writing zip files. + It can be seen as the reverse of the fz_archive zip + implementation. +*/ + +typedef struct fz_zip_writer fz_zip_writer; + +/** + Create a new zip writer that writes to a given file. + + filename: a path to the zip file to be created and written to + on disk. +*/ +fz_zip_writer *fz_new_zip_writer(fz_context *ctx, const char *filename); + +/** + Create a new zip writer that writes to a given output stream.
+ + Ownership of out passes in immediately upon calling this function. + The caller should never drop the fz_output, even if this function throws + an exception. +*/ +fz_zip_writer *fz_new_zip_writer_with_output(fz_context *ctx, fz_output *out); + + +/** + Given a buffer of data, (optionally) compress it, and add it to + the zip file with the given name. +*/ +void fz_write_zip_entry(fz_context *ctx, fz_zip_writer *zip, const char *name, fz_buffer *buf, int compress); + +/** + Close the zip file for writing. + + This flushes any pending data to the file. This can throw + exceptions. +*/ +void fz_close_zip_writer(fz_context *ctx, fz_zip_writer *zip); + +/** + Drop the reference to the zipfile. + + In common with other 'drop' methods, this will never throw an + exception. +*/ +void fz_drop_zip_writer(fz_context *ctx, fz_zip_writer *zip); + +/** + Create an archive that holds named buffers. + + tree can either be a preformed tree with fz_buffers as values, + or it can be NULL for an empty tree. +*/ +fz_archive *fz_new_tree_archive(fz_context *ctx, fz_tree *tree); + +/** + Add a named buffer to an existing tree archive. + + The tree will take a new reference to the buffer. Ownership + is not transferred. +*/ +void fz_tree_archive_add_buffer(fz_context *ctx, fz_archive *arch_, const char *name, fz_buffer *buf); + +/** + Add a named block of data to an existing tree archive. + + The data will be copied into a buffer, and so the caller + may free it as soon as this returns. +*/ +void fz_tree_archive_add_data(fz_context *ctx, fz_archive *arch_, const char *name, const void *data, size_t size); + +/** + Create a new multi archive (initially empty). +*/ +fz_archive *fz_new_multi_archive(fz_context *ctx); + +/** + Add an archive to the set of archives handled by a multi + archive. + + If path is NULL, then the archive contents will appear at the + top level, otherwise, the archives contents will appear prefixed + by path. 
+*/ +void fz_mount_multi_archive(fz_context *ctx, fz_archive *arch_, fz_archive *sub, const char *path); + +/** + Implementation details: Subject to change. +*/ + +struct fz_archive +{ + int refs; + + fz_stream *file; + const char *format; + + void (*drop_archive)(fz_context *ctx, fz_archive *arch); + int (*count_entries)(fz_context *ctx, fz_archive *arch); + const char *(*list_entry)(fz_context *ctx, fz_archive *arch, int idx); + int (*has_entry)(fz_context *ctx, fz_archive *arch, const char *name); + fz_buffer *(*read_entry)(fz_context *ctx, fz_archive *arch, const char *name); + fz_stream *(*open_entry)(fz_context *ctx, fz_archive *arch, const char *name); +}; + +fz_archive *fz_new_archive_of_size(fz_context *ctx, fz_stream *file, int size); + +#define fz_new_derived_archive(C,F,M) \ + ((M*)Memento_label(fz_new_archive_of_size(C, F, sizeof(M)), #M)) + + + +#endif diff --git a/include/mupdf/fitz/band-writer.h b/include/mupdf/fitz/band-writer.h new file mode 100644 index 0000000..853307b --- /dev/null +++ b/include/mupdf/fitz/band-writer.h @@ -0,0 +1,117 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. 
+// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_BAND_WRITER_H +#define MUPDF_FITZ_BAND_WRITER_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/output.h" +#include "mupdf/fitz/color.h" +#include "mupdf/fitz/separation.h" + +/** + fz_band_writer +*/ +typedef struct fz_band_writer fz_band_writer; + +/** + Cause a band writer to write the header for + a banded image with the given properties/dimensions etc. This + also configures the bandwriter for the format of the data to be + passed in future calls. + + w, h: Width and Height of the entire page. + + n: Number of components (including spots and alphas). + + alpha: Number of alpha components. + + xres, yres: X and Y resolutions in dpi. + + cs: Colorspace (NULL for bitmaps) + + seps: Separation details (or NULL). +*/ +void fz_write_header(fz_context *ctx, fz_band_writer *writer, int w, int h, int n, int alpha, int xres, int yres, int pagenum, fz_colorspace *cs, fz_separations *seps); + +/** + Cause a band writer to write the next band + of data for an image. + + stride: The byte offset from the first byte of the data + for a pixel to the first byte of the data for the same pixel + on the row below. + + band_height: The number of lines in this band. + + samples: Pointer to first byte of the data. +*/ +void fz_write_band(fz_context *ctx, fz_band_writer *writer, int stride, int band_height, const unsigned char *samples); + +/** + Finishes up the output and closes the band writer. After this + call no more headers or bands may be written. +*/ +void fz_close_band_writer(fz_context *ctx, fz_band_writer *writer); + +/** + Drop the reference to the band writer, causing it to be + destroyed. + + Never throws an exception. +*/ +void fz_drop_band_writer(fz_context *ctx, fz_band_writer *writer); + +/* Implementation details: subject to change. 
*/ + +typedef void (fz_write_header_fn)(fz_context *ctx, fz_band_writer *writer, fz_colorspace *cs); +typedef void (fz_write_band_fn)(fz_context *ctx, fz_band_writer *writer, int stride, int band_start, int band_height, const unsigned char *samples); +typedef void (fz_write_trailer_fn)(fz_context *ctx, fz_band_writer *writer); +typedef void (fz_close_band_writer_fn)(fz_context *ctx, fz_band_writer *writer); +typedef void (fz_drop_band_writer_fn)(fz_context *ctx, fz_band_writer *writer); + +struct fz_band_writer +{ + fz_drop_band_writer_fn *drop; + fz_close_band_writer_fn *close; + fz_write_header_fn *header; + fz_write_band_fn *band; + fz_write_trailer_fn *trailer; + fz_output *out; + int w; + int h; + int n; + int s; + int alpha; + int xres; + int yres; + int pagenum; + int line; + fz_separations *seps; +}; + +fz_band_writer *fz_new_band_writer_of_size(fz_context *ctx, size_t size, fz_output *out); +#define fz_new_band_writer(C,M,O) ((M *)Memento_label(fz_new_band_writer_of_size(C, sizeof(M), O), #M)) + + +#endif diff --git a/include/mupdf/fitz/bidi.h b/include/mupdf/fitz/bidi.h new file mode 100644 index 0000000..3aa2dbb --- /dev/null +++ b/include/mupdf/fitz/bidi.h @@ -0,0 +1,90 @@ +/** + Bidirectional text processing. + + Derived from the SmartOffice code, which is itself derived + from the example unicode standard code. Original copyright + messages follow: + + Copyright (C) Picsel, 2004-2008. All Rights Reserved. + + Processes Unicode text by arranging the characters into an order + suitable for display. E.g. Hebrew text will be arranged from + right-to-left and any English within the text will remain in the + left-to-right order. + + This is an implementation of the Unicode Bidirectional Algorithm + which can be found here: http://www.unicode.org/reports/tr9/ and + is based on the reference implementation found on Unicode.org.
+*/ + +#ifndef FITZ_BIDI_H +#define FITZ_BIDI_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" + +/* Implementation details: subject to change. */ + +typedef enum +{ + FZ_BIDI_LTR = 0, + FZ_BIDI_RTL = 1, + FZ_BIDI_NEUTRAL = 2 +} +fz_bidi_direction; + +typedef enum +{ + FZ_BIDI_CLASSIFY_WHITE_SPACE = 1, + FZ_BIDI_REPLACE_TAB = 2 +} +fz_bidi_flags; + +/** + Prototype for callback function supplied to fz_bidi_fragment_text. + + @param fragment first character in fragment + @param fragmentLen number of characters in fragment + @param bidiLevel The bidirectional level for this text. + The bottom bit will be set iff block + should concatenate with other blocks as + right-to-left + @param script the script in use for this fragment (other + than common or inherited) + @param arg data from caller of fz_bidi_fragment_text +*/ +typedef void (fz_bidi_fragment_fn)(const uint32_t *fragment, + size_t fragmentLen, + int bidiLevel, + int script, + void *arg); + +/** + Partitions the given Unicode sequence into one or more + unidirectional fragments and invokes the given callback + function for each fragment.
+ + For example, if directionality of text is: + 0123456789 + rrlllrrrrr, + we'll invoke callback with: + &text[0], length == 2 + &text[2], length == 3 + &text[5], length == 5 + + @param[in] text start of Unicode sequence + @param[in] textlen number of Unicodes to analyse + @param[in] baseDir direction of paragraph (specify FZ_BIDI_NEUTRAL to force auto-detection) + @param[in] callback function to be called for each fragment + @param[in] arg data to be passed to the callback function + @param[in] flags flags to control operation (see fz_bidi_flags above) +*/ +void fz_bidi_fragment_text(fz_context *ctx, + const uint32_t *text, + size_t textlen, + fz_bidi_direction *baseDir, + fz_bidi_fragment_fn *callback, + void *arg, + int flags); + +#endif diff --git a/include/mupdf/fitz/bitmap.h b/include/mupdf/fitz/bitmap.h new file mode 100644 index 0000000..20175de --- /dev/null +++ b/include/mupdf/fitz/bitmap.h @@ -0,0 +1,168 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_BITMAP_H +#define MUPDF_FITZ_BITMAP_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/pixmap.h" + +/** + Bitmaps have 1 bit per component. Only used for creating + halftoned versions of contone buffers, and saving out. Samples + are stored msb first, akin to pbms. + + The internals of this struct are considered implementation + details and subject to change. Where possible, accessor + functions should be used in preference. +*/ +typedef struct +{ + int refs; + int w, h, stride, n; + int xres, yres; + unsigned char *samples; +} fz_bitmap; + +/** + Take an additional reference to the bitmap. The same pointer + is returned. + + Never throws exceptions. +*/ +fz_bitmap *fz_keep_bitmap(fz_context *ctx, fz_bitmap *bit); + +/** + Drop a reference to the bitmap. When the reference count reaches + zero, the bitmap will be destroyed. + + Never throws exceptions. +*/ +void fz_drop_bitmap(fz_context *ctx, fz_bitmap *bit); + +/** + A halftone is a set of threshold tiles, one per component. Each + threshold tile is a pixmap, possibly of varying sizes and + phases. Currently, we only provide one 'default' halftone tile + for operating on 1 component plus alpha pixmaps (where the alpha + is ignored). This is signified by a fz_halftone pointer to NULL. +*/ +typedef struct fz_halftone fz_halftone; + +/** + Make a bitmap from a pixmap and a halftone. + + pix: The pixmap to generate from. Currently must be a single + color component with no alpha. + + ht: The halftone to use. NULL implies the default halftone. + + Returns the resultant bitmap. Throws exceptions in the case of + failure to allocate. +*/ +fz_bitmap *fz_new_bitmap_from_pixmap(fz_context *ctx, fz_pixmap *pix, fz_halftone *ht); + +/** + Make a bitmap from a pixmap and a + halftone, allowing for the position of the pixmap within an + overall banded rendering. + + pix: The pixmap to generate from. Currently must be a single + color component with no alpha. 
+ + ht: The halftone to use. NULL implies the default halftone. + + band_start: Vertical offset within the overall banded rendering + (in pixels) + + Returns the resultant bitmap. Throws exceptions in the case of + failure to allocate. +*/ +fz_bitmap *fz_new_bitmap_from_pixmap_band(fz_context *ctx, fz_pixmap *pix, fz_halftone *ht, int band_start); + +/** + Create a new bitmap. + + w, h: Width and Height for the bitmap + + n: Number of color components (assumed to be a divisor of 8) + + xres, yres: X and Y resolutions (in pixels per inch). + + Returns pointer to created bitmap structure. The bitmap + data is uninitialised. +*/ +fz_bitmap *fz_new_bitmap(fz_context *ctx, int w, int h, int n, int xres, int yres); + +/** + Retrieve details of a given bitmap. + + bitmap: The bitmap to query. + + w: Pointer to storage to retrieve width (or NULL). + + h: Pointer to storage to retrieve height (or NULL). + + n: Pointer to storage to retrieve number of color components (or + NULL). + + stride: Pointer to storage to retrieve bitmap stride (or NULL). +*/ +void fz_bitmap_details(fz_bitmap *bitmap, int *w, int *h, int *n, int *stride); + +/** + Set the entire bitmap to 0. + + Never throws exceptions. +*/ +void fz_clear_bitmap(fz_context *ctx, fz_bitmap *bit); + +/** + Create a 'default' halftone structure + for the given number of components. + + num_comps: The number of components to use. + + Returns a simple default halftone. The default halftone uses + the same halftone tile for each plane, which may not be ideal + for all purposes. +*/ +fz_halftone *fz_default_halftone(fz_context *ctx, int num_comps); + +/** + Take an additional reference to the halftone. The same pointer + is returned. + + Never throws exceptions. +*/ +fz_halftone *fz_keep_halftone(fz_context *ctx, fz_halftone *half); + +/** + Drop a reference to the halftone. When the reference count + reaches zero, the halftone is destroyed. + + Never throws exceptions. 
+*/ +void fz_drop_halftone(fz_context *ctx, fz_halftone *ht); + +#endif diff --git a/include/mupdf/fitz/buffer.h b/include/mupdf/fitz/buffer.h new file mode 100644 index 0000000..5ac949c --- /dev/null +++ b/include/mupdf/fitz/buffer.h @@ -0,0 +1,250 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_BUFFER_H +#define MUPDF_FITZ_BUFFER_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" + +/** + fz_buffer is a wrapper around a dynamically allocated array of + bytes. + + Buffers have a capacity (the number of bytes storage immediately + available) and a current size. + + The contents of the structure are considered implementation + details and are subject to change. Users should use the accessor + functions in preference. +*/ +typedef struct +{ + int refs; + unsigned char *data; + size_t cap, len; + int unused_bits; + int shared; +} fz_buffer; + +/** + Take an additional reference to the buffer. The same pointer + is returned. + + Never throws exceptions. 
+*/ +fz_buffer *fz_keep_buffer(fz_context *ctx, fz_buffer *buf); + +/** + Drop a reference to the buffer. When the reference count reaches + zero, the buffer is destroyed. + + Never throws exceptions. +*/ +void fz_drop_buffer(fz_context *ctx, fz_buffer *buf); + +/** + Retrieve internal memory of buffer. + + datap: Output parameter that will be pointed to the data. + + Returns the current size of the data in bytes. +*/ +size_t fz_buffer_storage(fz_context *ctx, fz_buffer *buf, unsigned char **datap); + +/** + Ensure that a buffer's data ends in a + 0 byte, and return a pointer to it. +*/ +const char *fz_string_from_buffer(fz_context *ctx, fz_buffer *buf); + +fz_buffer *fz_new_buffer(fz_context *ctx, size_t capacity); + +/** + Create a new buffer with existing data. + + data: Pointer to existing data. + size: Size of existing data. + + Takes ownership of data. Does not make a copy. Calls fz_free on + the data when the buffer is deallocated. Do not use 'data' after + passing to this function. + + Returns pointer to new buffer. Throws exception on allocation + failure. +*/ +fz_buffer *fz_new_buffer_from_data(fz_context *ctx, unsigned char *data, size_t size); + +/** + Like fz_new_buffer_from_data, but does not take ownership. +*/ +fz_buffer *fz_new_buffer_from_shared_data(fz_context *ctx, const unsigned char *data, size_t size); + +/** + Create a new buffer containing a copy of the passed data. +*/ +fz_buffer *fz_new_buffer_from_copied_data(fz_context *ctx, const unsigned char *data, size_t size); + +/** + Make a new buffer, containing a copy of the data used in + the original. +*/ +fz_buffer *fz_clone_buffer(fz_context *ctx, fz_buffer *buf); + +/** + Create a new buffer with data decoded from a base64 input string. +*/ +fz_buffer *fz_new_buffer_from_base64(fz_context *ctx, const char *data, size_t size); + +/** + Ensure that a buffer has a given capacity, + truncating data if required. + + capacity: The desired capacity for the buffer.
If the current + size of the buffer contents is larger than capacity, it is + truncated. +*/ +void fz_resize_buffer(fz_context *ctx, fz_buffer *buf, size_t capacity); + +/** + Make some space within a buffer (i.e. ensure that + capacity > size). +*/ +void fz_grow_buffer(fz_context *ctx, fz_buffer *buf); + +/** + Trim wasted capacity from a buffer by resizing internal memory. +*/ +void fz_trim_buffer(fz_context *ctx, fz_buffer *buf); + +/** + Empties the buffer. Storage is not freed, but is held ready + to be reused as the buffer is refilled. + + Never throws exceptions. +*/ +void fz_clear_buffer(fz_context *ctx, fz_buffer *buf); + +/** + Create a new buffer with a (subset of) the data from the buffer. + + start: if >= 0, offset from start of buffer, if < 0 offset from end of buffer. + + end: if >= 0, offset from start of buffer, if < 0 offset from end of buffer. + +*/ +fz_buffer *fz_slice_buffer(fz_context *ctx, fz_buffer *buf, int64_t start, int64_t end); + +/** + Append the contents of the source buffer onto the end of the + destination buffer, extending automatically as required. + + Ownership of buffers does not change. +*/ +void fz_append_buffer(fz_context *ctx, fz_buffer *destination, fz_buffer *source); + +/** + Write a base64 encoded data block, optionally with periodic newlines. +*/ +void fz_append_base64(fz_context *ctx, fz_buffer *out, const unsigned char *data, size_t size, int newline); + +/** + Append a base64 encoded fz_buffer, optionally with periodic newlines. +*/ +void fz_append_base64_buffer(fz_context *ctx, fz_buffer *out, fz_buffer *data, int newline); + +/** + fz_append_*: Append data to a buffer. + + The buffer will automatically grow as required.
+*/ +void fz_append_data(fz_context *ctx, fz_buffer *buf, const void *data, size_t len); +void fz_append_string(fz_context *ctx, fz_buffer *buf, const char *data); +void fz_append_byte(fz_context *ctx, fz_buffer *buf, int c); +void fz_append_rune(fz_context *ctx, fz_buffer *buf, int c); +void fz_append_int32_le(fz_context *ctx, fz_buffer *buf, int x); +void fz_append_int16_le(fz_context *ctx, fz_buffer *buf, int x); +void fz_append_int32_be(fz_context *ctx, fz_buffer *buf, int x); +void fz_append_int16_be(fz_context *ctx, fz_buffer *buf, int x); +void fz_append_bits(fz_context *ctx, fz_buffer *buf, int value, int count); +void fz_append_bits_pad(fz_context *ctx, fz_buffer *buf); + +/** + fz_append_pdf_string: Append a string with PDF syntax quotes and + escapes. + + The buffer will automatically grow as required. +*/ +void fz_append_pdf_string(fz_context *ctx, fz_buffer *buffer, const char *text); + +/** + fz_append_printf: Format and append data to buffer using + printf-like formatting (see fz_vsnprintf). + + The buffer will automatically grow as required. +*/ +void fz_append_printf(fz_context *ctx, fz_buffer *buffer, const char *fmt, ...); + +/** + fz_append_vprintf: Format and append data to buffer using + printf-like formatting with varargs (see fz_vsnprintf). +*/ +void fz_append_vprintf(fz_context *ctx, fz_buffer *buffer, const char *fmt, va_list args); + +/** + Zero-terminate buffer in order to use as a C string. + + This byte is invisible and does not affect the length of the + buffer as returned by fz_buffer_storage. The zero byte is + written *after* the data, and subsequent writes will overwrite + the terminating byte. + + Subsequent changes to the size of the buffer (such as by + fz_trim_buffer, fz_grow_buffer, fz_resize_buffer, etc) may + invalidate this. +*/ +void fz_terminate_buffer(fz_context *ctx, fz_buffer *buf); + +/** + Create an MD5 digest from buffer contents. + + Never throws exceptions.
+*/ +void fz_md5_buffer(fz_context *ctx, fz_buffer *buffer, unsigned char digest[16]); + +/** + Take ownership of buffer contents. + + Performs the same task as fz_buffer_storage, but ownership of + the data buffer returns with this call. The buffer is left + empty. + + Note: Bad things may happen if this is called on a buffer with + multiple references that is being used from multiple threads. + + data: Pointer to place to retrieve data pointer. + + Returns length of stream. +*/ +size_t fz_buffer_extract(fz_context *ctx, fz_buffer *buf, unsigned char **data); + +#endif diff --git a/include/mupdf/fitz/color.h b/include/mupdf/fitz/color.h new file mode 100644 index 0000000..0a7e985 --- /dev/null +++ b/include/mupdf/fitz/color.h @@ -0,0 +1,427 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_COLOR_H +#define MUPDF_FITZ_COLOR_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/store.h" + +#if FZ_ENABLE_ICC +/** + Opaque type for an ICC Profile. 
+*/ +typedef struct fz_icc_profile fz_icc_profile; +#endif + +/** + Describes a given colorspace. +*/ +typedef struct fz_colorspace fz_colorspace; + +/** + Pixmaps represent a set of pixels for a 2 dimensional region of + a plane. Each pixel has n components per pixel. The components + are in the order process-components, spot-colors, alpha, where + there can be 0 of any of those types. The data is in + premultiplied alpha when rendering, but non-premultiplied for + colorspace conversions and rescaling. +*/ +typedef struct fz_pixmap fz_pixmap; + +/* Color handling parameters: rendering intent, overprint, etc. */ + +enum +{ + /* Same order as needed by lcms */ + FZ_RI_PERCEPTUAL, + FZ_RI_RELATIVE_COLORIMETRIC, + FZ_RI_SATURATION, + FZ_RI_ABSOLUTE_COLORIMETRIC, +}; + +typedef struct +{ + uint8_t ri; /* rendering intent */ + uint8_t bp; /* black point compensation */ + uint8_t op; /* overprinting */ + uint8_t opm; /* overprint mode */ +} fz_color_params; + +FZ_DATA extern const fz_color_params fz_default_color_params; + +/** + Map from (case sensitive) rendering intent string to enumeration + value. +*/ +int fz_lookup_rendering_intent(const char *name); + +/** + Map from enumerated rendering intent to string. + + The returned string is static and therefore must not be freed. +*/ +const char *fz_rendering_intent_name(int ri); + +/** + The maximum number of colorants available in any given + color/colorspace (not including alpha). + + Changing this value will alter the amount of memory being used + (both stack and heap space), but not hugely. Speed should + (largely) be determined by the number of colors actually used. 
+*/ +enum { FZ_MAX_COLORS = 32 }; + +enum fz_colorspace_type +{ + FZ_COLORSPACE_NONE, + FZ_COLORSPACE_GRAY, + FZ_COLORSPACE_RGB, + FZ_COLORSPACE_BGR, + FZ_COLORSPACE_CMYK, + FZ_COLORSPACE_LAB, + FZ_COLORSPACE_INDEXED, + FZ_COLORSPACE_SEPARATION, +}; + +enum +{ + FZ_COLORSPACE_IS_DEVICE = 1, + FZ_COLORSPACE_IS_ICC = 2, + FZ_COLORSPACE_HAS_CMYK = 4, + FZ_COLORSPACE_HAS_SPOTS = 8, + FZ_COLORSPACE_HAS_CMYK_AND_SPOTS = 4|8, +}; + +/** + Creates a new colorspace instance and returns a reference. + + No internal checking is done that the colorspace type (e.g. + CMYK) matches with the flags (e.g. FZ_COLORSPACE_HAS_CMYK) or + colorant count (n) or name. + + The reference should be dropped when it is finished with. + + Colorspaces are immutable once created (with the exception of + setting up colorant names for separation spaces). +*/ +fz_colorspace *fz_new_colorspace(fz_context *ctx, enum fz_colorspace_type type, int flags, int n, const char *name); + +/** + Increment the reference count for the colorspace. + + Returns the same pointer. Never throws an exception. +*/ +fz_colorspace *fz_keep_colorspace(fz_context *ctx, fz_colorspace *colorspace); + +/** + Drops a reference to the colorspace. + + When the reference count reaches zero, the colorspace is + destroyed. +*/ +void fz_drop_colorspace(fz_context *ctx, fz_colorspace *colorspace); + +/** + Create an indexed colorspace. + + The supplied lookup table is high palette entries long. Each + entry is n bytes long, where n is given by the number of + colorants in the base colorspace, one byte per colorant. + + Ownership of lookup is passed in; it will be freed on + destruction, so must be heap allocated. + + The colorspace will keep an additional reference to the base + colorspace that will be dropped on destruction. + + The returned reference should be dropped when it is finished + with. + + Colorspaces are immutable once created.
+*/ +fz_colorspace *fz_new_indexed_colorspace(fz_context *ctx, fz_colorspace *base, int high, unsigned char *lookup); + +/** + Create a colorspace from an ICC profile supplied in buf. + + Limited checking is done to ensure that the colorspace type is + appropriate for the supplied ICC profile. + + An additional reference is taken to buf, which will be dropped + on destruction. Ownership is NOT passed in. + + The returned reference should be dropped when it is finished + with. + + Colorspaces are immutable once created. +*/ +fz_colorspace *fz_new_icc_colorspace(fz_context *ctx, enum fz_colorspace_type type, int flags, const char *name, fz_buffer *buf); + + +/** + Create a calibrated gray colorspace. + + The returned reference should be dropped when it is finished + with. + + Colorspaces are immutable once created. +*/ +fz_colorspace *fz_new_cal_gray_colorspace(fz_context *ctx, float wp[3], float bp[3], float gamma); + +/** + Create a calibrated rgb colorspace. + + The returned reference should be dropped when it is finished + with. + + Colorspaces are immutable once created. +*/ +fz_colorspace *fz_new_cal_rgb_colorspace(fz_context *ctx, float wp[3], float bp[3], float gamma[3], float matrix[9]); + +/** + Query the type of colorspace. +*/ +enum fz_colorspace_type fz_colorspace_type(fz_context *ctx, fz_colorspace *cs); + +/** + Query the name of a colorspace. + + The returned string has the same lifespan as the colorspace + does. Caller should not free it. +*/ +const char *fz_colorspace_name(fz_context *ctx, fz_colorspace *cs); + +/** + Query the number of colorants in a colorspace. +*/ +int fz_colorspace_n(fz_context *ctx, fz_colorspace *cs); + +/** + True for CMYK, Separation and DeviceN colorspaces. +*/ +int fz_colorspace_is_subtractive(fz_context *ctx, fz_colorspace *cs); + +/** + True if DeviceN color space has only colorants from the CMYK set. 
+*/ +int fz_colorspace_device_n_has_only_cmyk(fz_context *ctx, fz_colorspace *cs); + +/** + True if DeviceN color space has cyan magenta yellow or black as + one of its colorants. +*/ +int fz_colorspace_device_n_has_cmyk(fz_context *ctx, fz_colorspace *cs); + +/** + Tests for particular types of colorspaces +*/ +int fz_colorspace_is_gray(fz_context *ctx, fz_colorspace *cs); +int fz_colorspace_is_rgb(fz_context *ctx, fz_colorspace *cs); +int fz_colorspace_is_cmyk(fz_context *ctx, fz_colorspace *cs); +int fz_colorspace_is_lab(fz_context *ctx, fz_colorspace *cs); +int fz_colorspace_is_indexed(fz_context *ctx, fz_colorspace *cs); +int fz_colorspace_is_device_n(fz_context *ctx, fz_colorspace *cs); +int fz_colorspace_is_device(fz_context *ctx, fz_colorspace *cs); +int fz_colorspace_is_device_gray(fz_context *ctx, fz_colorspace *cs); +int fz_colorspace_is_device_cmyk(fz_context *ctx, fz_colorspace *cs); +int fz_colorspace_is_lab_icc(fz_context *ctx, fz_colorspace *cs); + +/** + Check to see that a colorspace is appropriate to be used as + a blending space (i.e. only grey, rgb or cmyk). +*/ +int fz_is_valid_blend_colorspace(fz_context *ctx, fz_colorspace *cs); + +/** + Get the 'base' colorspace for a colorspace. + + For indexed colorspaces, this is the colorspace the index + decodes into. For all other colorspaces, it is the colorspace + itself. + + The returned colorspace is 'borrowed' (i.e. no additional + references are taken or dropped). +*/ +fz_colorspace *fz_base_colorspace(fz_context *ctx, fz_colorspace *cs); + +/** + Retrieve global default colorspaces. + + These return borrowed references that should not be dropped, + unless they are kept first. +*/ +fz_colorspace *fz_device_gray(fz_context *ctx); +fz_colorspace *fz_device_rgb(fz_context *ctx); +fz_colorspace *fz_device_bgr(fz_context *ctx); +fz_colorspace *fz_device_cmyk(fz_context *ctx); +fz_colorspace *fz_device_lab(fz_context *ctx); + +/** + Assign a name for a given colorant in a colorspace. 
+ + Used while initially setting up a colorspace. The string is + copied into local storage, so need not be retained by the + caller. +*/ +void fz_colorspace_name_colorant(fz_context *ctx, fz_colorspace *cs, int n, const char *name); + +/** + Retrieve the name for a colorant. + + Returns a pointer with the same lifespan as the colorspace. +*/ +const char *fz_colorspace_colorant(fz_context *ctx, fz_colorspace *cs, int n); + +/* Color conversion */ + +/** + Clamp the samples in a color to the correct ranges for a + given colorspace. +*/ +void fz_clamp_color(fz_context *ctx, fz_colorspace *cs, const float *in, float *out); + +/** + Convert color values sv from colorspace ss into colorvalues dv + for colorspace ds, via an optional intervening space is, + respecting the given color_params. +*/ +void fz_convert_color(fz_context *ctx, fz_colorspace *ss, const float *sv, fz_colorspace *ds, float *dv, fz_colorspace *is, fz_color_params params); + +/* Default (fallback) colorspace handling */ + +/** + Structure to hold default colorspaces. +*/ +typedef struct +{ + int refs; + fz_colorspace *gray; + fz_colorspace *rgb; + fz_colorspace *cmyk; + fz_colorspace *oi; +} fz_default_colorspaces; + +/** + Create a new default colorspace structure with values inherited + from the context, and return a reference to it. + + These can be overridden using fz_set_default_xxxx. + + These should not be overridden while more than one caller has + the reference for fear of race conditions. + + The caller should drop this reference once finished with it. +*/ +fz_default_colorspaces *fz_new_default_colorspaces(fz_context *ctx); + +/** + Keep an additional reference to the default colorspaces + structure. + + Never throws exceptions. +*/ +fz_default_colorspaces* fz_keep_default_colorspaces(fz_context *ctx, fz_default_colorspaces *default_cs); + +/** + Drop a reference to the default colorspaces structure. 
When the + reference count reaches 0, the references it holds internally + to the underlying colorspaces will be dropped, and the structure + will be destroyed. + + Never throws exceptions. +*/ +void fz_drop_default_colorspaces(fz_context *ctx, fz_default_colorspaces *default_cs); + +/** + Returns a reference to a newly cloned default colorspaces + structure. + + The new clone may safely be altered without fear of race + conditions as the caller is the only reference holder. +*/ +fz_default_colorspaces *fz_clone_default_colorspaces(fz_context *ctx, fz_default_colorspaces *base); + +/** + Retrieve default colorspaces (typically page local). + + If default_cs is non NULL, the default is retrieved from there, + otherwise the global default is retrieved. + + These return borrowed references that should not be dropped, + unless they are kept first. +*/ +fz_colorspace *fz_default_gray(fz_context *ctx, const fz_default_colorspaces *default_cs); +fz_colorspace *fz_default_rgb(fz_context *ctx, const fz_default_colorspaces *default_cs); +fz_colorspace *fz_default_cmyk(fz_context *ctx, const fz_default_colorspaces *default_cs); +fz_colorspace *fz_default_output_intent(fz_context *ctx, const fz_default_colorspaces *default_cs); + +/** + Set new defaults within the default colorspace structure. + + New references are taken to the new default, and references to + the old defaults dropped. + + Never throws exceptions. +*/ +void fz_set_default_gray(fz_context *ctx, fz_default_colorspaces *default_cs, fz_colorspace *cs); +void fz_set_default_rgb(fz_context *ctx, fz_default_colorspaces *default_cs, fz_colorspace *cs); +void fz_set_default_cmyk(fz_context *ctx, fz_default_colorspaces *default_cs, fz_colorspace *cs); +void fz_set_default_output_intent(fz_context *ctx, fz_default_colorspaces *default_cs, fz_colorspace *cs); + +/* Implementation details: subject to change. 
*/ + +struct fz_colorspace +{ + fz_key_storable key_storable; + enum fz_colorspace_type type; + int flags; + int n; + char *name; + union { +#if FZ_ENABLE_ICC + struct { + fz_buffer *buffer; + unsigned char md5[16]; + fz_icc_profile *profile; + } icc; +#endif + struct { + fz_colorspace *base; + int high; + unsigned char *lookup; + } indexed; + struct { + fz_colorspace *base; + void (*eval)(fz_context *ctx, void *tint, const float *s, int sn, float *d, int dn); + void (*drop)(fz_context *ctx, void *tint); + void *tint; + char *colorant[FZ_MAX_COLORS]; + } separation; + } u; +}; + +void fz_drop_colorspace_imp(fz_context *ctx, fz_storable *cs_); + +#endif diff --git a/include/mupdf/fitz/compress.h b/include/mupdf/fitz/compress.h new file mode 100644 index 0000000..d5907a1 --- /dev/null +++ b/include/mupdf/fitz/compress.h @@ -0,0 +1,87 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_COMPRESS_H +#define MUPDF_FITZ_COMPRESS_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/buffer.h" + +typedef enum +{ + FZ_DEFLATE_NONE = 0, + FZ_DEFLATE_BEST_SPEED = 1, + FZ_DEFLATE_BEST = 9, + FZ_DEFLATE_DEFAULT = -1 +} fz_deflate_level; + +/** + Returns the upper bound on the + size of flated data of length size. + */ +size_t fz_deflate_bound(fz_context *ctx, size_t size); + +/** + Compress source_length bytes of data starting + at source, into a buffer of length *compressed_length, starting at dest. + *compressed_length will be updated on exit to contain the size + actually used. + */ +void fz_deflate(fz_context *ctx, unsigned char *dest, size_t *compressed_length, const unsigned char *source, size_t source_length, fz_deflate_level level); + +/** + Compress source_length bytes of data starting + at source, into a new memory block malloced for that purpose. + *compressed_length is updated on exit to contain the size used. + Ownership of the block is returned from this function, and the + caller is therefore responsible for freeing it. The block may be + considerably larger than is actually required. The caller is + free to fz_realloc it down if it wants to. +*/ +unsigned char *fz_new_deflated_data(fz_context *ctx, size_t *compressed_length, const unsigned char *source, size_t source_length, fz_deflate_level level); + +/** + Compress the contents of a fz_buffer into a + new block malloced for that purpose. *compressed_length is + updated on exit to contain the size used. Ownership of the block + is returned from this function, and the caller is therefore + responsible for freeing it. The block may be considerably larger + than is actually required. The caller is free to fz_realloc it + down if it wants to. +*/ +unsigned char *fz_new_deflated_data_from_buffer(fz_context *ctx, size_t *compressed_length, fz_buffer *buffer, fz_deflate_level level); + +/** + Compress bitmap data as CCITT Group 3 1D fax image. 
+ Creates a stream assuming the default PDF parameters, + except the number of columns. +*/ +fz_buffer *fz_compress_ccitt_fax_g3(fz_context *ctx, const unsigned char *data, int columns, int rows); + +/** + Compress bitmap data as CCITT Group 4 2D fax image. + Creates a stream assuming the default PDF parameters, except + K=-1 and the number of columns. +*/ +fz_buffer *fz_compress_ccitt_fax_g4(fz_context *ctx, const unsigned char *data, int columns, int rows); + +#endif diff --git a/include/mupdf/fitz/compressed-buffer.h b/include/mupdf/fitz/compressed-buffer.h new file mode 100644 index 0000000..924e0ef --- /dev/null +++ b/include/mupdf/fitz/compressed-buffer.h @@ -0,0 +1,173 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_COMPRESSED_BUFFER_H +#define MUPDF_FITZ_COMPRESSED_BUFFER_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/stream.h" +#include "mupdf/fitz/filter.h" + +/** + Compression parameters used for buffers of compressed data; + typically for the source data for images. 
+*/ +typedef struct +{ + int type; + union { + struct { + int color_transform; /* Use -1 for unset */ + } jpeg; + struct { + int smask_in_data; + } jpx; + struct { + fz_jbig2_globals *globals; + int embedded; + } jbig2; + struct { + int columns; + int rows; + int k; + int end_of_line; + int encoded_byte_align; + int end_of_block; + int black_is_1; + int damaged_rows_before_error; + } fax; + struct + { + int columns; + int colors; + int predictor; + int bpc; + } + flate; + struct + { + int columns; + int colors; + int predictor; + int bpc; + int early_change; + } lzw; + } u; +} fz_compression_params; + +/** + Buffers of compressed data; typically for the source data + for images. +*/ +typedef struct +{ + fz_compression_params params; + fz_buffer *buffer; +} fz_compressed_buffer; + +/** + Return the storage size used for a buffer and its data. + Used in implementing store handling. + + Never throws exceptions. +*/ +size_t fz_compressed_buffer_size(fz_compressed_buffer *buffer); + +/** + Open a stream to read the decompressed version of a buffer. +*/ +fz_stream *fz_open_compressed_buffer(fz_context *ctx, fz_compressed_buffer *); + +/** + Open a stream to read the decompressed version of a buffer, + with optional log2 subsampling. + + l2factor = NULL for no subsampling, or a pointer to an integer + containing the maximum log2 subsample factor acceptable (0 = + none, 1 = halve dimensions, 2 = quarter dimensions etc). If + non-NULL, then *l2factor will be updated on exit with the actual + log2 subsample factor achieved. +*/ +fz_stream *fz_open_image_decomp_stream_from_buffer(fz_context *ctx, fz_compressed_buffer *, int *l2factor); + +/** + Open a stream to read the decompressed version of another stream + with optional log2 subsampling. +*/ +fz_stream *fz_open_image_decomp_stream(fz_context *ctx, fz_stream *, fz_compression_params *, int *l2factor); + +/** + Recognise image format strings in the first 8 bytes from image + data. 
+*/ +int fz_recognize_image_format(fz_context *ctx, unsigned char p[8]); + +/** + Map from FZ_IMAGE_* value to string. + + The returned string is static and therefore must not be freed. +*/ +const char *fz_image_type_name(int type); + +/** + Map from (case sensitive) image type string to FZ_IMAGE_* + type value. +*/ +int fz_lookup_image_type(const char *type); + +enum +{ + FZ_IMAGE_UNKNOWN = 0, + + /* Uncompressed samples */ + FZ_IMAGE_RAW, + + /* Compressed samples */ + FZ_IMAGE_FAX, + FZ_IMAGE_FLATE, + FZ_IMAGE_LZW, + FZ_IMAGE_RLD, + + /* Full image formats */ + FZ_IMAGE_BMP, + FZ_IMAGE_GIF, + FZ_IMAGE_JBIG2, + FZ_IMAGE_JPEG, + FZ_IMAGE_JPX, + FZ_IMAGE_JXR, + FZ_IMAGE_PNG, + FZ_IMAGE_PNM, + FZ_IMAGE_TIFF, + FZ_IMAGE_PSD, +}; + +/** + Drop a reference to a compressed buffer. Destroys the buffer + and frees any storage/other references held by it. + + Never throws exceptions. +*/ +void fz_drop_compressed_buffer(fz_context *ctx, fz_compressed_buffer *buf); + +#endif diff --git a/include/mupdf/fitz/config.h b/include/mupdf/fitz/config.h new file mode 100644 index 0000000..aeef671 --- /dev/null +++ b/include/mupdf/fitz/config.h @@ -0,0 +1,222 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. 
+// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef FZ_CONFIG_H + +#define FZ_CONFIG_H + +/** + Enable the following for spot (and hence overprint/overprint + simulation) capable rendering. This forces FZ_PLOTTERS_N on. +*/ +/* #define FZ_ENABLE_SPOT_RENDERING 1 */ + +/** + Choose which plotters we need. + By default we build all the plotters in. To avoid building + plotters in that aren't needed, define the unwanted + FZ_PLOTTERS_... define to 0. +*/ +/* #define FZ_PLOTTERS_G 1 */ +/* #define FZ_PLOTTERS_RGB 1 */ +/* #define FZ_PLOTTERS_CMYK 1 */ +/* #define FZ_PLOTTERS_N 1 */ + +/** + Choose which document agents to include. + By default all are enabled. To avoid building unwanted + ones, define FZ_ENABLE_... to 0. +*/ +/* #define FZ_ENABLE_PDF 1 */ +/* #define FZ_ENABLE_XPS 1 */ +/* #define FZ_ENABLE_SVG 1 */ +/* #define FZ_ENABLE_CBZ 1 */ +/* #define FZ_ENABLE_IMG 1 */ +/* #define FZ_ENABLE_HTML 1 */ +/* #define FZ_ENABLE_EPUB 1 */ + +/** + Choose which document writers to include. + By default all are enabled. To avoid building unwanted + ones, define FZ_ENABLE_..._OUTPUT to 0. +*/ +/* #define FZ_ENABLE_OCR_OUTPUT 1 */ +/* #define FZ_ENABLE_DOCX_OUTPUT 1 */ +/* #define FZ_ENABLE_ODT_OUTPUT 1 */ + +/** + Choose whether to enable ICC color profiles. +*/ +/* #define FZ_ENABLE_ICC 1 */ + +/** + Choose whether to enable JPEG2000 decoding. + By default, it is enabled, but due to frequent security + issues with the third party libraries we support disabling + it with this flag. +*/ +/* #define FZ_ENABLE_JPX 1 */ + +/** + Choose whether to enable JavaScript. + By default JavaScript is enabled both for mutool and PDF + interactivity. +*/ +/* #define FZ_ENABLE_JS 1 */ + +/** + Choose which fonts to include. + By default we include the base 14 PDF fonts, + DroidSansFallback from Android for CJK, and + Charis SIL from SIL for epub/html. 
+ Enable the following defines to AVOID including + unwanted fonts. +*/ +/* To avoid all noto fonts except CJK, enable: */ +#define TOFU 1 + +/* To skip the CJK font, enable: (this implicitly enables TOFU_CJK_EXT + * and TOFU_CJK_LANG) */ +#define TOFU_CJK 1 + +/* To skip CJK Extension A, enable: (this implicitly enables + * TOFU_CJK_LANG) */ +/* #define TOFU_CJK_EXT */ + +/* To skip CJK language specific fonts, enable: */ +/* #define TOFU_CJK_LANG */ + +/* To skip the Emoji font, enable: */ +/* #define TOFU_EMOJI */ + +/* To skip the ancient/historic scripts, enable: */ +/* #define TOFU_HISTORIC */ + +/* To skip the symbol font, enable: */ +/* #define TOFU_SYMBOL */ + +/* To skip the SIL fonts, enable: */ +/* #define TOFU_SIL */ + +/* To skip the Base14 fonts, enable: */ +/* #define TOFU_BASE14 */ +/* (You probably really don't want to do that except for measurement + * purposes!) */ + +/* ---------- DO NOT EDIT ANYTHING UNDER THIS LINE ---------- */ + +#ifndef FZ_ENABLE_SPOT_RENDERING +#define FZ_ENABLE_SPOT_RENDERING 1 +#endif + +#if FZ_ENABLE_SPOT_RENDERING +#undef FZ_PLOTTERS_N +#define FZ_PLOTTERS_N 1 +#endif /* FZ_ENABLE_SPOT_RENDERING */ + +#ifndef FZ_PLOTTERS_G +#define FZ_PLOTTERS_G 1 +#endif /* FZ_PLOTTERS_G */ + +#ifndef FZ_PLOTTERS_RGB +#define FZ_PLOTTERS_RGB 1 +#endif /* FZ_PLOTTERS_RGB */ + +#ifndef FZ_PLOTTERS_CMYK +#define FZ_PLOTTERS_CMYK 1 +#endif /* FZ_PLOTTERS_CMYK */ + +#ifndef FZ_PLOTTERS_N +#define FZ_PLOTTERS_N 1 +#endif /* FZ_PLOTTERS_N */ + +/* We need at least 1 plotter defined */ +#if FZ_PLOTTERS_G == 0 && FZ_PLOTTERS_RGB == 0 && FZ_PLOTTERS_CMYK == 0 +#undef FZ_PLOTTERS_N +#define FZ_PLOTTERS_N 1 +#endif + +#ifndef FZ_ENABLE_PDF +#define FZ_ENABLE_PDF 1 +#endif /* FZ_ENABLE_PDF */ + +#ifndef FZ_ENABLE_XPS +#define FZ_ENABLE_XPS 1 +#endif /* FZ_ENABLE_XPS */ + +#ifndef FZ_ENABLE_SVG +#define FZ_ENABLE_SVG 1 +#endif /* FZ_ENABLE_SVG */ + +#ifndef FZ_ENABLE_CBZ +#define FZ_ENABLE_CBZ 1 +#endif /* FZ_ENABLE_CBZ */ + +#ifndef FZ_ENABLE_IMG 
+#define FZ_ENABLE_IMG 1 +#endif /* FZ_ENABLE_IMG */ + +#ifndef FZ_ENABLE_HTML +#define FZ_ENABLE_HTML 1 +#endif /* FZ_ENABLE_HTML */ + +#ifndef FZ_ENABLE_EPUB +#define FZ_ENABLE_EPUB 1 +#endif /* FZ_ENABLE_EPUB */ + +#ifndef FZ_ENABLE_OCR_OUTPUT +#define FZ_ENABLE_OCR_OUTPUT 1 +#endif /* FZ_ENABLE_OCR_OUTPUT */ + +#ifndef FZ_ENABLE_ODT_OUTPUT +#define FZ_ENABLE_ODT_OUTPUT 1 +#endif /* FZ_ENABLE_ODT_OUTPUT */ + +#ifndef FZ_ENABLE_DOCX_OUTPUT +#define FZ_ENABLE_DOCX_OUTPUT 1 +#endif /* FZ_ENABLE_DOCX_OUTPUT */ + +#ifndef FZ_ENABLE_JPX +#define FZ_ENABLE_JPX 1 +#endif /* FZ_ENABLE_JPX */ + +#ifndef FZ_ENABLE_JS +#define FZ_ENABLE_JS 1 +#endif /* FZ_ENABLE_JS */ + +#ifndef FZ_ENABLE_ICC +#define FZ_ENABLE_ICC 1 +#endif /* FZ_ENABLE_ICC */ + +/* If Epub and HTML are both disabled, disable SIL fonts */ +#if FZ_ENABLE_HTML == 0 && FZ_ENABLE_EPUB == 0 +#undef TOFU_SIL +#define TOFU_SIL +#endif + +#if !defined(HAVE_LEPTONICA) || !defined(HAVE_TESSERACT) +#ifndef OCR_DISABLED +#define OCR_DISABLED +#endif +#endif + +#endif /* FZ_CONFIG_H */ diff --git a/include/mupdf/fitz/context.h b/include/mupdf/fitz/context.h new file mode 100644 index 0000000..a000a53 --- /dev/null +++ b/include/mupdf/fitz/context.h @@ -0,0 +1,942 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. 
If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_CONTEXT_H +#define MUPDF_FITZ_CONTEXT_H + +#include "mupdf/fitz/version.h" +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/geometry.h" + + +#ifndef FZ_VERBOSE_EXCEPTIONS +#define FZ_VERBOSE_EXCEPTIONS 0 +#endif + +typedef struct fz_font_context fz_font_context; +typedef struct fz_colorspace_context fz_colorspace_context; +typedef struct fz_style_context fz_style_context; +typedef struct fz_tuning_context fz_tuning_context; +typedef struct fz_store fz_store; +typedef struct fz_glyph_cache fz_glyph_cache; +typedef struct fz_document_handler_context fz_document_handler_context; +typedef struct fz_output fz_output; +typedef struct fz_context fz_context; + +/** + Allocator structure; holds callbacks and private data pointer. +*/ +typedef struct +{ + void *user; + void *(*malloc)(void *, size_t); + void *(*realloc)(void *, void *, size_t); + void (*free)(void *, void *); +} fz_alloc_context; + +/** + Exception macro definitions. Just treat these as a black box - + pay no attention to the man behind the curtain. +*/ +#define fz_var(var) fz_var_imp((void *)&(var)) +#define fz_try(ctx) if (!fz_setjmp(*fz_push_try(ctx))) if (fz_do_try(ctx)) do +#define fz_always(ctx) while (0); if (fz_do_always(ctx)) do +#define fz_catch(ctx) while (0); if (fz_do_catch(ctx)) + +/** + These macros provide a simple exception handling system. Use them as + follows: + + fz_try(ctx) + ... + fz_catch(ctx) + ... + + or as: + + fz_try(ctx) + ... + fz_always(ctx) + ... + fz_catch(ctx) + ... + + Code within the fz_try() section can then throw exceptions using fz_throw() + (or fz_vthrow()). + + They are implemented with setjmp/longjmp, which can have unfortunate + consequences for 'losing' local variable values on a throw. 
To avoid this + we recommend calling 'fz_var(variable)' before the fz_try() for any + local variable whose value may change within the fz_try() block and whose + value will be required afterwards. + + Do not call anything in the fz_always() section that can throw. + + Any exception can be rethrown from the fz_catch() section using fz_rethrow() + as long as there has been no intervening use of fz_try/fz_catch. +*/ + +/** + Throw an exception. + + This assumes an enclosing fz_try() block within the callstack. +*/ +FZ_NORETURN void fz_vthrow(fz_context *ctx, int errcode, const char *, va_list ap); +FZ_NORETURN void fz_throw(fz_context *ctx, int errcode, const char *, ...) FZ_PRINTFLIKE(3,4); +FZ_NORETURN void fz_rethrow(fz_context *ctx); + +/** + Called within a catch block this modifies the current + exception's code. If it's of type 'fromcode' it is + modified to 'tocode'. Typically used for 'downgrading' + exception severity. +*/ +void fz_morph_error(fz_context *ctx, int fromcode, int tocode); + +/** + Log a warning. + + This goes to the registered warning stream (stderr by + default). +*/ +void fz_vwarn(fz_context *ctx, const char *fmt, va_list ap); +void fz_warn(fz_context *ctx, const char *fmt, ...) FZ_PRINTFLIKE(2,3); + +/** + Within an fz_catch() block, retrieve the formatted message + string for the current exception. + + This assumes no intervening use of fz_try/fz_catch. +*/ +const char *fz_caught_message(fz_context *ctx); + +/** + Within an fz_catch() block, retrieve the error code for + the current exception. + + This assumes no intervening use of fz_try/fz_catch. +*/ +int fz_caught(fz_context *ctx); + +/** + Within an fz_catch() block, rethrow the current exception + if the errcode of the current exception matches. + + This assumes no intervening use of fz_try/fz_catch. +*/ +void fz_rethrow_if(fz_context *ctx, int errcode); + +/** + Format an error message, and log it to the registered + error stream (stderr by default). 
+*/ +void fz_log_error_printf(fz_context *ctx, const char *fmt, ...) FZ_PRINTFLIKE(2,3); +void fz_vlog_error_printf(fz_context *ctx, const char *fmt, va_list ap); + +/** + Log a (preformatted) string to the registered + error stream (stderr by default). +*/ +void fz_log_error(fz_context *ctx, const char *str); + +void fz_start_throw_on_repair(fz_context *ctx); +void fz_end_throw_on_repair(fz_context *ctx); + +/** + Now, a debugging feature. If FZ_VERBOSE_EXCEPTIONS is 1 then + some of the above functions are replaced by versions that print + FILE and LINE information. +*/ +#if FZ_VERBOSE_EXCEPTIONS +#define fz_vthrow(CTX, ERRCODE, FMT, VA) fz_vthrowFL(CTX, __FILE__, __LINE__, ERRCODE, FMT, VA) +#define fz_throw(CTX, ERRCODE, ...) fz_throwFL(CTX, __FILE__, __LINE__, ERRCODE, __VA_ARGS__) +#define fz_rethrow(CTX) fz_rethrowFL(CTX, __FILE__, __LINE__) +#define fz_morph_error(CTX, FROM, TO) fz_morph_errorFL(CTX, __FILE__, __LINE__, FROM, TO) +#define fz_vwarn(CTX, FMT, VA) fz_vwarnFL(CTX, __FILE__, __LINE__, FMT, VA) +#define fz_warn(CTX, ...) fz_warnFL(CTX, __FILE__, __LINE__, __VA_ARGS__) +#define fz_rethrow_if(CTX, ERRCODE) fz_rethrow_ifFL(CTX, __FILE__, __LINE__, ERRCODE) +#define fz_log_error_printf(CTX, ...) fz_log_error_printfFL(CTX, __FILE__, __LINE__, __VA_ARGS__) +#define fz_vlog_error_printf(CTX, FMT, VA) fz_vlog_error_printfFL(CTX, __FILE__, __LINE__, FMT, VA) +#define fz_log_error(CTX, STR) fz_log_errorFL(CTX, __FILE__, __LINE__, STR) +FZ_NORETURN void fz_vthrowFL(fz_context *ctx, const char *file, int line, int errcode, const char *fmt, va_list ap); +FZ_NORETURN void fz_throwFL(fz_context *ctx, const char *file, int line, int errcode, const char *fmt, ...) 
FZ_PRINTFLIKE(5,6); +FZ_NORETURN void fz_rethrowFL(fz_context *ctx, const char *file, int line); +void fz_morph_errorFL(fz_context *ctx, const char *file, int line, int fromcode, int tocode); +void fz_vwarnFL(fz_context *ctx, const char *file, int line, const char *fmt, va_list ap); +void fz_warnFL(fz_context *ctx, const char *file, int line, const char *fmt, ...) FZ_PRINTFLIKE(4,5); +void fz_rethrow_ifFL(fz_context *ctx, const char *file, int line, int errcode); +void fz_log_error_printfFL(fz_context *ctx, const char *file, int line, const char *fmt, ...) FZ_PRINTFLIKE(4,5); +void fz_vlog_error_printfFL(fz_context *ctx, const char *file, int line, const char *fmt, va_list ap); +void fz_log_errorFL(fz_context *ctx, const char *file, int line, const char *str); +#endif + +enum +{ + FZ_ERROR_NONE = 0, + FZ_ERROR_MEMORY = 1, + FZ_ERROR_GENERIC = 2, + FZ_ERROR_SYNTAX = 3, + FZ_ERROR_MINOR = 4, + FZ_ERROR_TRYLATER = 5, + FZ_ERROR_ABORT = 6, + FZ_ERROR_REPAIRED = 7, + FZ_ERROR_COUNT +}; + +/** + Flush any repeated warnings. + + Repeated warnings are buffered, counted and eventually printed + along with the number of repetitions. Call fz_flush_warnings + to force printing of the latest buffered warning and the + number of repetitions, for example to make sure that all + warnings are printed before exiting an application. +*/ +void fz_flush_warnings(fz_context *ctx); + +/** + Locking functions + + MuPDF is kept deliberately free of any knowledge of particular + threading systems. As such, in order for safe multi-threaded + operation, we rely on callbacks to client provided functions. + + A client is expected to provide FZ_LOCK_MAX number of mutexes, + and a function to lock/unlock each of them. These may be + recursive mutexes, but do not have to be. + + If a client does not intend to use multiple threads, then it + may pass NULL instead of a lock structure. 
+ + In order to avoid deadlocks, we have one simple rule + internally as to how we use locks: We can never take lock n + when we already hold any lock i, where 0 <= i <= n. In order + to verify this, we have some debugging code, that can be + enabled by defining FITZ_DEBUG_LOCKING. +*/ + +typedef struct +{ + void *user; + void (*lock)(void *user, int lock); + void (*unlock)(void *user, int lock); +} fz_locks_context; + +enum { + FZ_LOCK_ALLOC = 0, + FZ_LOCK_FREETYPE, + FZ_LOCK_GLYPHCACHE, + FZ_LOCK_MAX +}; + +#if defined(MEMENTO) || !defined(NDEBUG) +#define FITZ_DEBUG_LOCKING +#endif + +#ifdef FITZ_DEBUG_LOCKING + +void fz_assert_lock_held(fz_context *ctx, int lock); +void fz_assert_lock_not_held(fz_context *ctx, int lock); +void fz_lock_debug_lock(fz_context *ctx, int lock); +void fz_lock_debug_unlock(fz_context *ctx, int lock); + +#else + +#define fz_assert_lock_held(A,B) do { } while (0) +#define fz_assert_lock_not_held(A,B) do { } while (0) +#define fz_lock_debug_lock(A,B) do { } while (0) +#define fz_lock_debug_unlock(A,B) do { } while (0) + +#endif /* !FITZ_DEBUG_LOCKING */ + +/** + Specifies the maximum size in bytes of the resource store in + fz_context. Given as argument to fz_new_context. + + FZ_STORE_UNLIMITED: Let resource store grow unbounded. + + FZ_STORE_DEFAULT: A reasonable upper bound on the size, for + devices that are not memory constrained. +*/ +enum { + FZ_STORE_UNLIMITED = 0, + FZ_STORE_DEFAULT = 256 << 20, +}; + +/** + Allocate context containing global state. + + The global state contains an exception stack, resource store, + etc. Most functions in MuPDF take a context argument to be + able to reference the global state. See fz_drop_context for + freeing an allocated context. + + alloc: Supply a custom memory allocator through a set of + function pointers. Set to NULL for the standard library + allocator. 
The context will keep the allocator pointer, so the + data it points to must not be modified or freed during the + lifetime of the context. + + locks: Supply a set of locks and functions to lock/unlock + them, intended for multi-threaded applications. Set to NULL + when using MuPDF in a single-threaded application. The + context will keep the locks pointer, so the data it points to + must not be modified or freed during the lifetime of the + context. + + max_store: Maximum size in bytes of the resource store, before + it will start evicting cached resources such as fonts and + images. FZ_STORE_UNLIMITED can be used if a hard limit is not + desired. Use FZ_STORE_DEFAULT to get a reasonable size. + + May return NULL. +*/ +#define fz_new_context(alloc, locks, max_store) fz_new_context_imp(alloc, locks, max_store, FZ_VERSION) + +/** + Make a clone of an existing context. + + This function is meant to be used in multi-threaded + applications where each thread requires its own context, yet + parts of the global state, for example caching, are shared. + + ctx: Context obtained from fz_new_context to make a copy of. + ctx must have had locks and lock/unlock functions set up when created. + The two contexts will share the memory allocator, resource + store, locks and lock/unlock functions. They will each have + their own exception stacks though. + + May return NULL. +*/ +fz_context *fz_clone_context(fz_context *ctx); + +/** + Free a context and its global state. + + The context and all of its global state is freed, and any + buffered warnings are flushed (see fz_flush_warnings). If NULL + is passed in nothing will happen. + + Must not be called for a context that is being used in an active + fz_try(), fz_always() or fz_catch() block. +*/ +void fz_drop_context(fz_context *ctx); + +/** + Set the user field in the context. + + NULL initially, this field can be set to any opaque value + required by the user. It is copied on clones. 
+*/ +void fz_set_user_context(fz_context *ctx, void *user); + +/** + Read the user field from the context. +*/ +void *fz_user_context(fz_context *ctx); + +/** + FIXME: Better not to expose fz_default_error_callback and + fz_default_warning_callback, and to allow 'NULL' to be used + in fz_set_xxxx_callback to mean "defaults". + + FIXME: Do we need/want functions like + fz_error_callback(ctx, message) to allow callers to inject + stuff into the error/warning streams? +*/ +/** + The default error callback. Declared publicly just so that the + error callback can be set back to this after it has been + overridden. +*/ +void fz_default_error_callback(void *user, const char *message); + +/** + The default warning callback. Declared publicly just so that + the warning callback can be set back to this after it has been + overridden. +*/ +void fz_default_warning_callback(void *user, const char *message); + +/** + A callback called whenever an error message is generated. + The user pointer passed to fz_set_error_callback() is passed + along with the error message. +*/ +typedef void (fz_error_cb)(void *user, const char *message); + +/** + A callback called whenever a warning message is generated. + The user pointer passed to fz_set_warning_callback() is + passed along with the warning message. +*/ +typedef void (fz_warning_cb)(void *user, const char *message); + +/** + Set the error callback. This will be called as part of the + exception handling. + + The callback must not throw exceptions! +*/ +void fz_set_error_callback(fz_context *ctx, fz_error_cb *error_cb, void *user); + +/** + Retrieve the currently set error callback, or NULL if none + has been set. Optionally, if user is non-NULL, the user pointer + given when the error callback was set is also passed back to + the caller. +*/ +fz_error_cb *fz_error_callback(fz_context *ctx, void **user); + +/** + Set the warning callback. This will be called as part of the + exception handling.
+ + The callback must not throw exceptions! +*/ +void fz_set_warning_callback(fz_context *ctx, fz_warning_cb *warning_cb, void *user); + +/** + Retrieve the currently set warning callback, or NULL if none + has been set. Optionally, if user is non-NULL, the user pointer + given when the warning callback was set is also passed back to + the caller. +*/ +fz_warning_cb *fz_warning_callback(fz_context *ctx, void **user); + +/** + In order to tune MuPDF's behaviour, certain functions can + (optionally) be provided by callers. +*/ + +/** + Given the width and height of an image, + the subsample factor, and the subarea of the image actually + required, the caller can decide whether to decode the whole + image or just a subarea. + + arg: The caller supplied opaque argument. + + w, h: The width/height of the complete image. + + l2factor: The log2 factor for subsampling (i.e. image will be + decoded to (w>>l2factor, h>>l2factor)). + + subarea: The actual subarea required for the current operation. + The tuning function is allowed to increase this in size if + required. +*/ +typedef void (fz_tune_image_decode_fn)(void *arg, int w, int h, int l2factor, fz_irect *subarea); + +/** + Given the source width and height of + image, together with the actual required width and height, + decide whether we should use mitchell scaling. + + arg: The caller supplied opaque argument. + + dst_w, dst_h: The actual width/height required on the target + device. + + src_w, src_h: The source width/height of the image. + + Return 0 not to use the Mitchell scaler, 1 to use the Mitchell + scaler. All other values reserved. +*/ +typedef int (fz_tune_image_scale_fn)(void *arg, int dst_w, int dst_h, int src_w, int src_h); + +/** + Set the tuning function to use for + image decode. + + image_decode: Function to use. + + arg: Opaque argument to be passed to tuning function. 
+*/ +void fz_tune_image_decode(fz_context *ctx, fz_tune_image_decode_fn *image_decode, void *arg); + +/** + Set the tuning function to use for + image scaling. + + image_scale: Function to use. + + arg: Opaque argument to be passed to tuning function. +*/ +void fz_tune_image_scale(fz_context *ctx, fz_tune_image_scale_fn *image_scale, void *arg); + +/** + Get the number of bits of antialiasing we are + using (for graphics). Between 0 and 8. +*/ +int fz_aa_level(fz_context *ctx); + +/** + Set the number of bits of antialiasing we should + use (for both text and graphics). + + bits: The number of bits of antialiasing to use (values are + clamped to within the 0 to 8 range). +*/ +void fz_set_aa_level(fz_context *ctx, int bits); + +/** + Get the number of bits of antialiasing we are + using for text. Between 0 and 8. +*/ +int fz_text_aa_level(fz_context *ctx); + +/** + Set the number of bits of antialiasing we + should use for text. + + bits: The number of bits of antialiasing to use (values are + clamped to within the 0 to 8 range). +*/ +void fz_set_text_aa_level(fz_context *ctx, int bits); + +/** + Get the number of bits of antialiasing we are + using for graphics. Between 0 and 8. +*/ +int fz_graphics_aa_level(fz_context *ctx); + +/** + Set the number of bits of antialiasing we + should use for graphics. + + bits: The number of bits of antialiasing to use (values are + clamped to within the 0 to 8 range). +*/ +void fz_set_graphics_aa_level(fz_context *ctx, int bits); + +/** + Get the minimum line width to be + used for stroked lines. + + min_line_width: The minimum line width to use (in pixels). +*/ +float fz_graphics_min_line_width(fz_context *ctx); + +/** + Set the minimum line width to be + used for stroked lines. + + min_line_width: The minimum line width to use (in pixels). +*/ +void fz_set_graphics_min_line_width(fz_context *ctx, float min_line_width); + +/** + Get the user stylesheet source text. 
+*/ +const char *fz_user_css(fz_context *ctx); + +/** + Set the user stylesheet source text for use with HTML and EPUB. +*/ +void fz_set_user_css(fz_context *ctx, const char *text); + +/** + Return whether to respect document styles in HTML and EPUB. +*/ +int fz_use_document_css(fz_context *ctx); + +/** + Toggle whether to respect document styles in HTML and EPUB. +*/ +void fz_set_use_document_css(fz_context *ctx, int use); + +/** + Enable icc profile based operation. +*/ +void fz_enable_icc(fz_context *ctx); + +/** + Disable icc profile based operation. +*/ +void fz_disable_icc(fz_context *ctx); + +/** + Memory Allocation and Scavenging: + + All calls to MuPDF's allocator functions pass through to the + underlying allocators passed in when the initial context is + created, after locks are taken (using the supplied locking + function) to ensure that only one thread at a time calls + through. + + If the underlying allocator fails, MuPDF attempts to make room + for the allocation by evicting elements from the store, then + retrying. + + Any call to allocate may then result in several calls to the + underlying allocator, and result in elements that are only + referred to by the store being freed. +*/ + +/** + Allocate memory for a structure, clear it, and tag the pointer + for Memento. + + Throws exception in the event of failure to allocate. +*/ +#define fz_malloc_struct(CTX, TYPE) \ + ((TYPE*)Memento_label(fz_calloc(CTX, 1, sizeof(TYPE)), #TYPE)) + +/** + Allocate memory for an array of structures, clear it, and tag + the pointer for Memento. + + Throws exception in the event of failure to allocate. +*/ +#define fz_malloc_struct_array(CTX, N, TYPE) \ + ((TYPE*)Memento_label(fz_calloc(CTX, N, sizeof(TYPE)), #TYPE "[]")) + +/** + Allocate uninitialized memory for an array of structures, and + tag the pointer for Memento. Does NOT clear the memory! + + Throws exception in the event of failure to allocate. 
+*/ +#define fz_malloc_array(CTX, COUNT, TYPE) \ + ((TYPE*)Memento_label(fz_malloc(CTX, (COUNT) * sizeof(TYPE)), #TYPE "[]")) +#define fz_realloc_array(CTX, OLD, COUNT, TYPE) \ + ((TYPE*)Memento_label(fz_realloc(CTX, OLD, (COUNT) * sizeof(TYPE)), #TYPE "[]")) + +/** + Allocate uninitialized memory of a given size. + Does NOT clear the memory! + + May return NULL for size = 0. + + Throws exception in the event of failure to allocate. +*/ +void *fz_malloc(fz_context *ctx, size_t size); + +/** + Allocate array of memory of count entries of size bytes. + Clears the memory to zero. + + Throws exception in the event of failure to allocate. +*/ +void *fz_calloc(fz_context *ctx, size_t count, size_t size); + +/** + Reallocates a block of memory to given size. Existing contents + up to min(old_size,new_size) are maintained. The rest of the + block is uninitialised. + + fz_realloc(ctx, NULL, size) behaves like fz_malloc(ctx, size). + + fz_realloc(ctx, p, 0); behaves like fz_free(ctx, p). + + Throws exception in the event of failure to allocate. +*/ +void *fz_realloc(fz_context *ctx, void *p, size_t size); + +/** + Free a previously allocated block of memory. + + fz_free(ctx, NULL) does nothing. + + Never throws exceptions. +*/ +void fz_free(fz_context *ctx, void *p); + +/** + fz_malloc equivalent that returns NULL rather than throwing + exceptions. +*/ +void *fz_malloc_no_throw(fz_context *ctx, size_t size); + +/** + fz_calloc equivalent that returns NULL rather than throwing + exceptions. +*/ +void *fz_calloc_no_throw(fz_context *ctx, size_t count, size_t size); + +/** + fz_realloc equivalent that returns NULL rather than throwing + exceptions. +*/ +void *fz_realloc_no_throw(fz_context *ctx, void *p, size_t size); + +/** + Portable strdup implementation, using fz allocators. +*/ +char *fz_strdup(fz_context *ctx, const char *s); + +/** + Fill block with len bytes of pseudo-randomness. 
+*/ +void fz_memrnd(fz_context *ctx, uint8_t *block, int len); + + +/* Implementation details: subject to change. */ + +/* Implementations exposed for speed, but considered private. */ + +void fz_var_imp(void *); +fz_jmp_buf *fz_push_try(fz_context *ctx); +int fz_do_try(fz_context *ctx); +int fz_do_always(fz_context *ctx); +int fz_do_catch(fz_context *ctx); + +#ifndef FZ_JMPBUF_ALIGN +#define FZ_JMPBUF_ALIGN 32 +#endif + +typedef struct +{ + fz_jmp_buf buffer; + int state, code; + char padding[FZ_JMPBUF_ALIGN-sizeof(int)*2]; +} fz_error_stack_slot; + +typedef struct +{ + fz_error_stack_slot *top; + fz_error_stack_slot stack[256]; + fz_error_stack_slot padding; + fz_error_stack_slot *stack_base; + int errcode; + void *print_user; + void (*print)(void *user, const char *message); + char message[256]; +} fz_error_context; + +typedef struct +{ + void *print_user; + void (*print)(void *user, const char *message); + int count; + char message[256]; +} fz_warn_context; + +typedef struct +{ + int hscale; + int vscale; + int scale; + int bits; + int text_bits; + float min_line_width; +} fz_aa_context; + +struct fz_context +{ + void *user; + fz_alloc_context alloc; + fz_locks_context locks; + fz_error_context error; + fz_warn_context warn; + + /* unshared contexts */ + fz_aa_context aa; + uint16_t seed48[7]; +#if FZ_ENABLE_ICC + int icc_enabled; +#endif + int throw_on_repair; + + /* TODO: should these be unshared? */ + fz_document_handler_context *handler; + fz_style_context *style; + fz_tuning_context *tuning; + + /* shared contexts */ + fz_output *stddbg; + fz_font_context *font; + fz_colorspace_context *colorspace; + fz_store *store; + fz_glyph_cache *glyph_cache; +}; + +fz_context *fz_new_context_imp(const fz_alloc_context *alloc, const fz_locks_context *locks, size_t max_store, const char *version); + +/** + Lock one of the user supplied mutexes. 
+*/ +static inline void +fz_lock(fz_context *ctx, int lock) +{ + fz_lock_debug_lock(ctx, lock); + ctx->locks.lock(ctx->locks.user, lock); +} + +/** + Unlock one of the user supplied mutexes. +*/ +static inline void +fz_unlock(fz_context *ctx, int lock) +{ + fz_lock_debug_unlock(ctx, lock); + ctx->locks.unlock(ctx->locks.user, lock); +} + +/* Lock-safe reference counting functions */ + +static inline void * +fz_keep_imp(fz_context *ctx, void *p, int *refs) +{ + if (p) + { + (void)Memento_checkIntPointerOrNull(refs); + fz_lock(ctx, FZ_LOCK_ALLOC); + if (*refs > 0) + { + (void)Memento_takeRef(p); + ++*refs; + } + fz_unlock(ctx, FZ_LOCK_ALLOC); + } + return p; +} + +static inline void * +fz_keep_imp_locked(fz_context *ctx FZ_UNUSED, void *p, int *refs) +{ + if (p) + { + (void)Memento_checkIntPointerOrNull(refs); + if (*refs > 0) + { + (void)Memento_takeRef(p); + ++*refs; + } + } + return p; +} + +static inline void * +fz_keep_imp8_locked(fz_context *ctx FZ_UNUSED, void *p, int8_t *refs) +{ + if (p) + { + (void)Memento_checkIntPointerOrNull(refs); + if (*refs > 0) + { + (void)Memento_takeRef(p); + ++*refs; + } + } + return p; +} + +static inline void * +fz_keep_imp8(fz_context *ctx, void *p, int8_t *refs) +{ + if (p) + { + (void)Memento_checkBytePointerOrNull(refs); + fz_lock(ctx, FZ_LOCK_ALLOC); + if (*refs > 0) + { + (void)Memento_takeRef(p); + ++*refs; + } + fz_unlock(ctx, FZ_LOCK_ALLOC); + } + return p; +} + +static inline void * +fz_keep_imp16(fz_context *ctx, void *p, int16_t *refs) +{ + if (p) + { + (void)Memento_checkShortPointerOrNull(refs); + fz_lock(ctx, FZ_LOCK_ALLOC); + if (*refs > 0) + { + (void)Memento_takeRef(p); + ++*refs; + } + fz_unlock(ctx, FZ_LOCK_ALLOC); + } + return p; +} + +static inline int +fz_drop_imp(fz_context *ctx, void *p, int *refs) +{ + if (p) + { + int drop; + (void)Memento_checkIntPointerOrNull(refs); + fz_lock(ctx, FZ_LOCK_ALLOC); + if (*refs > 0) + { + (void)Memento_dropIntRef(p); + drop = --*refs == 0; + } + else + drop = 0; + 
fz_unlock(ctx, FZ_LOCK_ALLOC); + return drop; + } + return 0; +} + +static inline int +fz_drop_imp8(fz_context *ctx, void *p, int8_t *refs) +{ + if (p) + { + int drop; + (void)Memento_checkBytePointerOrNull(refs); + fz_lock(ctx, FZ_LOCK_ALLOC); + if (*refs > 0) + { + (void)Memento_dropByteRef(p); + drop = --*refs == 0; + } + else + drop = 0; + fz_unlock(ctx, FZ_LOCK_ALLOC); + return drop; + } + return 0; +} + +static inline int +fz_drop_imp16(fz_context *ctx, void *p, int16_t *refs) +{ + if (p) + { + int drop; + (void)Memento_checkShortPointerOrNull(refs); + fz_lock(ctx, FZ_LOCK_ALLOC); + if (*refs > 0) + { + (void)Memento_dropShortRef(p); + drop = --*refs == 0; + } + else + drop = 0; + fz_unlock(ctx, FZ_LOCK_ALLOC); + return drop; + } + return 0; +} + +#endif diff --git a/include/mupdf/fitz/crypt.h b/include/mupdf/fitz/crypt.h new file mode 100644 index 0000000..e556f3b --- /dev/null +++ b/include/mupdf/fitz/crypt.h @@ -0,0 +1,270 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_CRYPT_H +#define MUPDF_FITZ_CRYPT_H + +#include "mupdf/fitz/system.h" + +/* md5 digests */ + +/** + Structure definition is public to enable stack + based allocation. Do not access the members directly. +*/ +typedef struct +{ + uint32_t lo, hi; + uint32_t a, b, c, d; + unsigned char buffer[64]; +} fz_md5; + +/** + MD5 initialization. Begins an MD5 operation, writing a new + context. + + Never throws an exception. +*/ +void fz_md5_init(fz_md5 *state); + +/** + MD5 block update operation. Continues an MD5 message-digest + operation, processing another message block, and updating the + context. + + Never throws an exception. +*/ +void fz_md5_update(fz_md5 *state, const unsigned char *input, size_t inlen); + +/** + MD5 block update operation. Continues an MD5 message-digest + operation, processing an int64, and updating the context. + + Never throws an exception. +*/ +void fz_md5_update_int64(fz_md5 *state, int64_t i); + +/** + MD5 finalization. Ends an MD5 message-digest operation, writing + the message digest and zeroizing the context. + + Never throws an exception. +*/ +void fz_md5_final(fz_md5 *state, unsigned char digest[16]); + +/* sha-256 digests */ + +/** + Structure definition is public to enable stack + based allocation. Do not access the members directly. +*/ +typedef struct +{ + unsigned int state[8]; + unsigned int count[2]; + union { + unsigned char u8[64]; + unsigned int u32[16]; + } buffer; +} fz_sha256; + +/** + SHA256 initialization. Begins an SHA256 operation, initialising + the supplied context. + + Never throws an exception. +*/ +void fz_sha256_init(fz_sha256 *state); + +/** + SHA256 block update operation. Continues an SHA256 message- + digest operation, processing another message block, and updating + the context. + + Never throws an exception. +*/ +void fz_sha256_update(fz_sha256 *state, const unsigned char *input, size_t inlen); + +/** + SHA256 finalization.
Ends an SHA256 message-digest operation, writing + the message digest and zeroizing the context. + + Never throws an exception. +*/ +void fz_sha256_final(fz_sha256 *state, unsigned char digest[32]); + +/* sha-512 digests */ + +/** + Structure definition is public to enable stack + based allocation. Do not access the members directly. +*/ +typedef struct +{ + uint64_t state[8]; + unsigned int count[2]; + union { + unsigned char u8[128]; + uint64_t u64[16]; + } buffer; +} fz_sha512; + +/** + SHA512 initialization. Begins an SHA512 operation, initialising + the supplied context. + + Never throws an exception. +*/ +void fz_sha512_init(fz_sha512 *state); + +/** + SHA512 block update operation. Continues an SHA512 message- + digest operation, processing another message block, and updating + the context. + + Never throws an exception. +*/ +void fz_sha512_update(fz_sha512 *state, const unsigned char *input, size_t inlen); + +/** + SHA512 finalization. Ends an SHA512 message-digest operation, + writing the message digest and zeroizing the context. + + Never throws an exception. +*/ +void fz_sha512_final(fz_sha512 *state, unsigned char digest[64]); + +/* sha-384 digests */ + +typedef fz_sha512 fz_sha384; + +/** + SHA384 initialization. Begins an SHA384 operation, initialising + the supplied context. + + Never throws an exception. +*/ +void fz_sha384_init(fz_sha384 *state); + +/** + SHA384 block update operation. Continues an SHA384 message- + digest operation, processing another message block, and updating + the context. + + Never throws an exception. +*/ +void fz_sha384_update(fz_sha384 *state, const unsigned char *input, size_t inlen); + +/** + SHA384 finalization. Ends an SHA384 message-digest operation, + writing the message digest and zeroizing the context. + + Never throws an exception. +*/ +void fz_sha384_final(fz_sha384 *state, unsigned char digest[64]); + +/* arc4 crypto */ + +/** + Structure definition is public to enable stack + based allocation.
Do not access the members directly. +*/ +typedef struct +{ + unsigned x; + unsigned y; + unsigned char state[256]; +} fz_arc4; + +/** + RC4 initialization. Begins an RC4 operation, writing a new + context. + + Never throws an exception. +*/ +void fz_arc4_init(fz_arc4 *state, const unsigned char *key, size_t len); + +/** + RC4 block encrypt operation; encrypt src into dest (both of + length len) updating the RC4 state as we go. + + Never throws an exception. +*/ +void fz_arc4_encrypt(fz_arc4 *state, unsigned char *dest, const unsigned char *src, size_t len); + +/** + RC4 finalization. Zero the context. + + Never throws an exception. +*/ +void fz_arc4_final(fz_arc4 *state); + +/* AES block cipher implementation from XYSSL */ + +/** + Structure definitions are public to enable stack + based allocation. Do not access the members directly. +*/ +typedef struct +{ + int nr; /* number of rounds */ + uint32_t *rk; /* AES round keys */ + uint32_t buf[68]; /* unaligned data */ +} fz_aes; + +#define FZ_AES_DECRYPT 0 +#define FZ_AES_ENCRYPT 1 + +/** + AES encryption initialisation. Fills in the supplied context + and prepares for encryption using the given key. + + Returns non-zero for error (key size, in bits, other than + 128/192/256). + + Never throws an exception. +*/ +int fz_aes_setkey_enc(fz_aes *ctx, const unsigned char *key, int keysize); + +/** + AES decryption initialisation. Fills in the supplied context + and prepares for decryption using the given key. + + Returns non-zero for error (key size, in bits, other than + 128/192/256). + + Never throws an exception. +*/ +int fz_aes_setkey_dec(fz_aes *ctx, const unsigned char *key, int keysize); + +/** + AES block processing. Encrypts or decrypts (according to mode, + which must match what was initially set up) length bytes (which + must be a multiple of 16), using (and modifying) the initialisation + vector iv, reading from input, and writing to output. + + Never throws an exception.
+*/ +void fz_aes_crypt_cbc(fz_aes *ctx, int mode, size_t length, + unsigned char iv[16], + const unsigned char *input, + unsigned char *output ); + +#endif diff --git a/include/mupdf/fitz/device.h b/include/mupdf/fitz/device.h new file mode 100644 index 0000000..b65fa3e --- /dev/null +++ b/include/mupdf/fitz/device.h @@ -0,0 +1,595 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_DEVICE_H +#define MUPDF_FITZ_DEVICE_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/image.h" +#include "mupdf/fitz/shade.h" +#include "mupdf/fitz/path.h" +#include "mupdf/fitz/text.h" + +/** + The different format handlers (pdf, xps etc) interpret pages to + a device. These devices can then process the stream of calls + they receive in various ways: + The trace device outputs debugging information for the calls. + The draw device will render them. + The list device stores them in a list to play back later. + The text device performs text extraction and searching. 
+ The bbox device calculates the bounding box for the page. + Other devices can (and will) be written in the future. +*/ +typedef struct fz_device fz_device; + +enum +{ + /* Flags */ + FZ_DEVFLAG_MASK = 1, + FZ_DEVFLAG_COLOR = 2, + FZ_DEVFLAG_UNCACHEABLE = 4, + FZ_DEVFLAG_FILLCOLOR_UNDEFINED = 8, + FZ_DEVFLAG_STROKECOLOR_UNDEFINED = 16, + FZ_DEVFLAG_STARTCAP_UNDEFINED = 32, + FZ_DEVFLAG_DASHCAP_UNDEFINED = 64, + FZ_DEVFLAG_ENDCAP_UNDEFINED = 128, + FZ_DEVFLAG_LINEJOIN_UNDEFINED = 256, + FZ_DEVFLAG_MITERLIMIT_UNDEFINED = 512, + FZ_DEVFLAG_LINEWIDTH_UNDEFINED = 1024, + /* Arguably we should have a bit for the dash pattern itself + * being undefined, but that causes problems; do we assume that + * it should always be set to non-dashing at the start of every + * glyph? */ + FZ_DEVFLAG_BBOX_DEFINED = 2048, + FZ_DEVFLAG_GRIDFIT_AS_TILED = 4096, +}; + +enum +{ + /* PDF 1.4 -- standard separable */ + FZ_BLEND_NORMAL, + FZ_BLEND_MULTIPLY, + FZ_BLEND_SCREEN, + FZ_BLEND_OVERLAY, + FZ_BLEND_DARKEN, + FZ_BLEND_LIGHTEN, + FZ_BLEND_COLOR_DODGE, + FZ_BLEND_COLOR_BURN, + FZ_BLEND_HARD_LIGHT, + FZ_BLEND_SOFT_LIGHT, + FZ_BLEND_DIFFERENCE, + FZ_BLEND_EXCLUSION, + + /* PDF 1.4 -- standard non-separable */ + FZ_BLEND_HUE, + FZ_BLEND_SATURATION, + FZ_BLEND_COLOR, + FZ_BLEND_LUMINOSITY, + + /* For packing purposes */ + FZ_BLEND_MODEMASK = 15, + FZ_BLEND_ISOLATED = 16, + FZ_BLEND_KNOCKOUT = 32 +}; + +/** + Map from (case sensitive) blend mode string to enumeration. +*/ +int fz_lookup_blendmode(const char *name); + +/** + Map from enumeration to blend mode string. + + The string is static, with arbitrary lifespan. +*/ +const char *fz_blendmode_name(int blendmode); + +/** + The device structure is public to allow devices to be + implemented outside of fitz. + + Device methods should always be called using e.g. + fz_fill_path(ctx, dev, ...) rather than + dev->fill_path(ctx, dev, ...) 
+*/ + +/** + Devices can keep track of containers (clips/masks/groups/tiles) + as they go to save callers having to do it. +*/ +typedef struct +{ + fz_rect scissor; + int type; + int user; +} fz_device_container_stack; + +enum +{ + fz_device_container_stack_is_clip, + fz_device_container_stack_is_mask, + fz_device_container_stack_is_group, + fz_device_container_stack_is_tile, +}; + +/* Structure types */ +typedef enum +{ + FZ_STRUCTURE_INVALID = -1, + + /* Grouping elements (PDF 1.7 - Table 10.20) */ + FZ_STRUCTURE_DOCUMENT, + FZ_STRUCTURE_PART, + FZ_STRUCTURE_ART, + FZ_STRUCTURE_SECT, + FZ_STRUCTURE_DIV, + FZ_STRUCTURE_BLOCKQUOTE, + FZ_STRUCTURE_CAPTION, + FZ_STRUCTURE_TOC, + FZ_STRUCTURE_TOCI, + FZ_STRUCTURE_INDEX, + FZ_STRUCTURE_NONSTRUCT, + FZ_STRUCTURE_PRIVATE, + /* Grouping elements (PDF 2.0 - Table 364) */ + FZ_STRUCTURE_DOCUMENTFRAGMENT, + /* Grouping elements (PDF 2.0 - Table 365) */ + FZ_STRUCTURE_ASIDE, + /* Grouping elements (PDF 2.0 - Table 366) */ + FZ_STRUCTURE_TITLE, + FZ_STRUCTURE_FENOTE, + /* Grouping elements (PDF 2.0 - Table 367) */ + FZ_STRUCTURE_SUB, + + /* Paragraphlike elements (PDF 1.7 - Table 10.21) */ + FZ_STRUCTURE_P, + FZ_STRUCTURE_H, + FZ_STRUCTURE_H1, + FZ_STRUCTURE_H2, + FZ_STRUCTURE_H3, + FZ_STRUCTURE_H4, + FZ_STRUCTURE_H5, + FZ_STRUCTURE_H6, + + /* List elements (PDF 1.7 - Table 10.23) */ + FZ_STRUCTURE_LIST, + FZ_STRUCTURE_LISTITEM, + FZ_STRUCTURE_LABEL, + FZ_STRUCTURE_LISTBODY, + + /* Table elements (PDF 1.7 - Table 10.24) */ + FZ_STRUCTURE_TABLE, + FZ_STRUCTURE_TR, + FZ_STRUCTURE_TH, + FZ_STRUCTURE_TD, + FZ_STRUCTURE_THEAD, + FZ_STRUCTURE_TBODY, + FZ_STRUCTURE_TFOOT, + + /* Inline elements (PDF 1.7 - Table 10.25) */ + FZ_STRUCTURE_SPAN, + FZ_STRUCTURE_QUOTE, + FZ_STRUCTURE_NOTE, + FZ_STRUCTURE_REFERENCE, + FZ_STRUCTURE_BIBENTRY, + FZ_STRUCTURE_CODE, + FZ_STRUCTURE_LINK, + FZ_STRUCTURE_ANNOT, + /* Inline elements (PDF 2.0 - Table 368) */ + FZ_STRUCTURE_EM, + FZ_STRUCTURE_STRONG, + + /* Ruby inline element (PDF 1.7 - Table 10.26) 
*/ + FZ_STRUCTURE_RUBY, + FZ_STRUCTURE_RB, + FZ_STRUCTURE_RT, + FZ_STRUCTURE_RP, + + /* Warichu inline element (PDF 1.7 - Table 10.26) */ + FZ_STRUCTURE_WARICHU, + FZ_STRUCTURE_WT, + FZ_STRUCTURE_WP, + + /* Illustration elements (PDF 1.7 - Table 10.27) */ + FZ_STRUCTURE_FIGURE, + FZ_STRUCTURE_FORMULA, + FZ_STRUCTURE_FORM, + + /* Artifact structure type (PDF 2.0 - Table 375) */ + FZ_STRUCTURE_ARTIFACT +} fz_structure; + +const char *fz_structure_to_string(fz_structure type); +fz_structure fz_structure_from_string(const char *str); + +typedef enum +{ + FZ_METATEXT_ACTUALTEXT, + FZ_METATEXT_ALT, + FZ_METATEXT_ABBREVIATION, + FZ_METATEXT_TITLE +} fz_metatext; + +struct fz_device +{ + int refs; + int hints; + int flags; + + void (*close_device)(fz_context *, fz_device *); + void (*drop_device)(fz_context *, fz_device *); + + void (*fill_path)(fz_context *, fz_device *, const fz_path *, int even_odd, fz_matrix, fz_colorspace *, const float *color, float alpha, fz_color_params ); + void (*stroke_path)(fz_context *, fz_device *, const fz_path *, const fz_stroke_state *, fz_matrix, fz_colorspace *, const float *color, float alpha, fz_color_params ); + void (*clip_path)(fz_context *, fz_device *, const fz_path *, int even_odd, fz_matrix, fz_rect scissor); + void (*clip_stroke_path)(fz_context *, fz_device *, const fz_path *, const fz_stroke_state *, fz_matrix, fz_rect scissor); + + void (*fill_text)(fz_context *, fz_device *, const fz_text *, fz_matrix, fz_colorspace *, const float *color, float alpha, fz_color_params ); + void (*stroke_text)(fz_context *, fz_device *, const fz_text *, const fz_stroke_state *, fz_matrix, fz_colorspace *, const float *color, float alpha, fz_color_params ); + void (*clip_text)(fz_context *, fz_device *, const fz_text *, fz_matrix, fz_rect scissor); + void (*clip_stroke_text)(fz_context *, fz_device *, const fz_text *, const fz_stroke_state *, fz_matrix, fz_rect scissor); + void (*ignore_text)(fz_context *, fz_device *, const fz_text *, 
fz_matrix ); + + void (*fill_shade)(fz_context *, fz_device *, fz_shade *shd, fz_matrix ctm, float alpha, fz_color_params color_params); + void (*fill_image)(fz_context *, fz_device *, fz_image *img, fz_matrix ctm, float alpha, fz_color_params color_params); + void (*fill_image_mask)(fz_context *, fz_device *, fz_image *img, fz_matrix ctm, fz_colorspace *, const float *color, float alpha, fz_color_params color_params); + void (*clip_image_mask)(fz_context *, fz_device *, fz_image *img, fz_matrix ctm, fz_rect scissor); + + void (*pop_clip)(fz_context *, fz_device *); + + void (*begin_mask)(fz_context *, fz_device *, fz_rect area, int luminosity, fz_colorspace *, const float *bc, fz_color_params ); + void (*end_mask)(fz_context *, fz_device *); + void (*begin_group)(fz_context *, fz_device *, fz_rect area, fz_colorspace *cs, int isolated, int knockout, int blendmode, float alpha); + void (*end_group)(fz_context *, fz_device *); + + int (*begin_tile)(fz_context *, fz_device *, fz_rect area, fz_rect view, float xstep, float ystep, fz_matrix ctm, int id); + void (*end_tile)(fz_context *, fz_device *); + + void (*render_flags)(fz_context *, fz_device *, int set, int clear); + void (*set_default_colorspaces)(fz_context *, fz_device *, fz_default_colorspaces *); + + void (*begin_layer)(fz_context *, fz_device *, const char *layer_name); + void (*end_layer)(fz_context *, fz_device *); + + void (*begin_structure)(fz_context *, fz_device *, fz_structure standard, const char *raw, int uid); + void (*end_structure)(fz_context *, fz_device *); + + void (*begin_metatext)(fz_context *, fz_device *, fz_metatext meta, const char *text); + void (*end_metatext)(fz_context *, fz_device *); + + fz_rect d1_rect; + + int container_len; + int container_cap; + fz_device_container_stack *container; +}; + +/** + Device calls; graphics primitives and containers. 
+*/ +void fz_fill_path(fz_context *ctx, fz_device *dev, const fz_path *path, int even_odd, fz_matrix ctm, fz_colorspace *colorspace, const float *color, float alpha, fz_color_params color_params); +void fz_stroke_path(fz_context *ctx, fz_device *dev, const fz_path *path, const fz_stroke_state *stroke, fz_matrix ctm, fz_colorspace *colorspace, const float *color, float alpha, fz_color_params color_params); +void fz_clip_path(fz_context *ctx, fz_device *dev, const fz_path *path, int even_odd, fz_matrix ctm, fz_rect scissor); +void fz_clip_stroke_path(fz_context *ctx, fz_device *dev, const fz_path *path, const fz_stroke_state *stroke, fz_matrix ctm, fz_rect scissor); +void fz_fill_text(fz_context *ctx, fz_device *dev, const fz_text *text, fz_matrix ctm, fz_colorspace *colorspace, const float *color, float alpha, fz_color_params color_params); +void fz_stroke_text(fz_context *ctx, fz_device *dev, const fz_text *text, const fz_stroke_state *stroke, fz_matrix ctm, fz_colorspace *colorspace, const float *color, float alpha, fz_color_params color_params); +void fz_clip_text(fz_context *ctx, fz_device *dev, const fz_text *text, fz_matrix ctm, fz_rect scissor); +void fz_clip_stroke_text(fz_context *ctx, fz_device *dev, const fz_text *text, const fz_stroke_state *stroke, fz_matrix ctm, fz_rect scissor); +void fz_ignore_text(fz_context *ctx, fz_device *dev, const fz_text *text, fz_matrix ctm); +void fz_pop_clip(fz_context *ctx, fz_device *dev); +void fz_fill_shade(fz_context *ctx, fz_device *dev, fz_shade *shade, fz_matrix ctm, float alpha, fz_color_params color_params); +void fz_fill_image(fz_context *ctx, fz_device *dev, fz_image *image, fz_matrix ctm, float alpha, fz_color_params color_params); +void fz_fill_image_mask(fz_context *ctx, fz_device *dev, fz_image *image, fz_matrix ctm, fz_colorspace *colorspace, const float *color, float alpha, fz_color_params color_params); +void fz_clip_image_mask(fz_context *ctx, fz_device *dev, fz_image *image, fz_matrix ctm, fz_rect 
scissor); +void fz_begin_mask(fz_context *ctx, fz_device *dev, fz_rect area, int luminosity, fz_colorspace *colorspace, const float *bc, fz_color_params color_params); +void fz_end_mask(fz_context *ctx, fz_device *dev); +void fz_begin_group(fz_context *ctx, fz_device *dev, fz_rect area, fz_colorspace *cs, int isolated, int knockout, int blendmode, float alpha); +void fz_end_group(fz_context *ctx, fz_device *dev); +void fz_begin_tile(fz_context *ctx, fz_device *dev, fz_rect area, fz_rect view, float xstep, float ystep, fz_matrix ctm); +int fz_begin_tile_id(fz_context *ctx, fz_device *dev, fz_rect area, fz_rect view, float xstep, float ystep, fz_matrix ctm, int id); +void fz_end_tile(fz_context *ctx, fz_device *dev); +void fz_render_flags(fz_context *ctx, fz_device *dev, int set, int clear); +void fz_set_default_colorspaces(fz_context *ctx, fz_device *dev, fz_default_colorspaces *default_cs); +void fz_begin_layer(fz_context *ctx, fz_device *dev, const char *layer_name); +void fz_end_layer(fz_context *ctx, fz_device *dev); +void fz_begin_structure(fz_context *ctx, fz_device *dev, fz_structure standard, const char *raw, int uid); +void fz_end_structure(fz_context *ctx, fz_device *dev); +void fz_begin_metatext(fz_context *ctx, fz_device *dev, fz_metatext meta, const char *text); +void fz_end_metatext(fz_context *ctx, fz_device *dev); + +/** + Devices are created by calls to device implementations, for + instance: foo_new_device(). These will be implemented by calling + fz_new_derived_device(ctx, foo_device) where foo_device is a + structure "derived from" fz_device, for instance + typedef struct { fz_device base; ...extras...} foo_device; +*/ +fz_device *fz_new_device_of_size(fz_context *ctx, int size); +#define fz_new_derived_device(CTX, TYPE) \ + ((TYPE *)Memento_label(fz_new_device_of_size(ctx,sizeof(TYPE)),#TYPE)) + +/** + Signal the end of input, and flush any buffered output. + This is NOT called implicitly on fz_drop_device. This + may throw exceptions. 
+*/ +void fz_close_device(fz_context *ctx, fz_device *dev); + +/** + Reduce the reference count on a device. When the reference count + reaches zero, the device and its resources will be freed. + Don't forget to call fz_close_device before dropping the device, + or you may get incomplete output! + + Never throws exceptions. +*/ +void fz_drop_device(fz_context *ctx, fz_device *dev); + +/** + Increment the reference count for a device. Returns the same + pointer. + + Never throws exceptions. +*/ +fz_device *fz_keep_device(fz_context *ctx, fz_device *dev); + +/** + Enable (set) hint bits within the hint bitfield for a device. +*/ +void fz_enable_device_hints(fz_context *ctx, fz_device *dev, int hints); + +/** + Disable (clear) hint bits within the hint bitfield for a device. +*/ +void fz_disable_device_hints(fz_context *ctx, fz_device *dev, int hints); + +/** + Find current scissor region as tracked by the device. +*/ +fz_rect fz_device_current_scissor(fz_context *ctx, fz_device *dev); + +enum +{ + /* Hints */ + FZ_DONT_INTERPOLATE_IMAGES = 1, + FZ_NO_CACHE = 2, +}; + +/** + Cookie support - simple communication channel between app/library. +*/ + +/** + Provide two-way communication between application and library. + Intended for multi-threaded applications where one thread is + rendering pages and another thread wants to read progress + feedback or abort a job that takes a long time to finish. The + communication is unsynchronized without locking. + + abort: The application should set this field to 0 before + calling fz_run_page to render a page. At any point when the + page is being rendered the application may set this field to 1 + which will cause the rendering to finish soon. This field is + checked periodically when the page is rendered, but exactly + when is not known, therefore there is no upper bound on + exactly when the rendering will abort.
If the application + did not provide a set of locks to fz_new_context, it must also + await the completion of fz_run_page before issuing another + call to fz_run_page. Note that once the application has set + this field to 1 after it called fz_run_page it may not change + the value again. + + progress: Communicates rendering progress back to the + application and is read only. Increments as a page is being + rendered. The value starts out at 0 and is limited to less + than or equal to progress_max, unless progress_max is -1. + + progress_max: Communicates the known upper bound of rendering + back to the application and is read only. The maximum value + that the progress field may take. If there is no known upper + bound on how long the rendering may take this value is -1 and + progress is not limited. Note that the value of progress_max + may change from -1 to a positive value once an upper bound is + known, so take this into consideration when comparing the + value of progress to that of progress_max. + + errors: count of errors during current rendering. + + incomplete: Initially should be set to 0. Will be set to + non-zero if a TRYLATER error is thrown during rendering. +*/ +typedef struct +{ + int abort; + int progress; + size_t progress_max; /* (size_t)-1 for unknown */ + int errors; + int incomplete; +} fz_cookie; + +/** + Create a device to print a debug trace of all device calls. +*/ +fz_device *fz_new_trace_device(fz_context *ctx, fz_output *out); + +/** + Create a device to output raw information. +*/ +fz_device *fz_new_xmltext_device(fz_context *ctx, fz_output *out); + +/** + Create a device to compute the bounding + box of all marks on a page. + + The returned bounding box will be the union of all bounding + boxes of all objects on a page. +*/ +fz_device *fz_new_bbox_device(fz_context *ctx, fz_rect *rectp); + +/** + Create a device to test for features. + + Currently only tests for the presence of non-grayscale colors. 
+ + is_color: Possible values returned: + 0: Definitely greyscale + 1: Probably color (all colors were grey, but there + were images or shadings in a non grey colorspace). + 2: Definitely color + + threshold: The difference from grayscale that will be tolerated. + Typical values to use are either 0 (be exact) or 0.02 (allow an + imperceptible amount of slop). + + options: A set of bitfield options, from the FZ_TEST_OPT set. + + passthrough: A device to pass all calls through to, or NULL. + If set, then the test device can both test and pass through to + an underlying device (like, say, the display list device). This + means that a display list can be created and at the end we'll + know if it's colored or not. + + In the absence of a passthrough device, the device will throw + an exception to stop page interpretation when color is found. +*/ +fz_device *fz_new_test_device(fz_context *ctx, int *is_color, float threshold, int options, fz_device *passthrough); + +enum +{ + /* If set, test every pixel of images exhaustively. + * If clear, just look at colorspaces for images. */ + FZ_TEST_OPT_IMAGES = 1, + + /* If set, test every pixel of shadings. */ + /* If clear, just look at colorspaces for shadings. */ + FZ_TEST_OPT_SHADINGS = 2 +}; + +/** + Create a device to draw on a pixmap. + + dest: Target pixmap for the draw device. See fz_new_pixmap* + for how to obtain a pixmap.
The pixmap is not cleared by the + draw device, see fz_clear_pixmap* for how to clear it prior to + calling fz_new_draw_device. Free the device by calling + fz_drop_device. + + transform: Transform from user space in points to device space + in pixels. + + clip: Bounding box to restrict any marking operations of the + draw device. +*/ +fz_device *fz_new_draw_device_with_bbox(fz_context *ctx, fz_matrix transform, fz_pixmap *dest, const fz_irect *clip); + +/** + Create a device to draw on a pixmap. + + dest: Target pixmap for the draw device. See fz_new_pixmap* + for how to obtain a pixmap. The pixmap is not cleared by the + draw device, see fz_clear_pixmap* for how to clear it prior to + calling fz_new_draw_device. Free the device by calling + fz_drop_device. + + transform: Transform from user space in points to device space + in pixels. + + proof_cs: Intermediate color space to map through when mapping to + color space defined by pixmap. +*/ +fz_device *fz_new_draw_device_with_proof(fz_context *ctx, fz_matrix transform, fz_pixmap *dest, fz_colorspace *proof_cs); + +/** + Create a device to draw on a pixmap. + + dest: Target pixmap for the draw device. See fz_new_pixmap* + for how to obtain a pixmap. The pixmap is not cleared by the + draw device, see fz_clear_pixmap* for how to clear it prior to + calling fz_new_draw_device. Free the device by calling + fz_drop_device. + + transform: Transform from user space in points to device space + in pixels. + + clip: Bounding box to restrict any marking operations of the + draw device. + + proof_cs: Color space to render to prior to mapping to color + space defined by pixmap. +*/ +fz_device *fz_new_draw_device_with_bbox_proof(fz_context *ctx, fz_matrix transform, fz_pixmap *dest, const fz_irect *clip, fz_colorspace *cs); + +fz_device *fz_new_draw_device_type3(fz_context *ctx, fz_matrix transform, fz_pixmap *dest); + +/** + struct fz_draw_options: Options for creating a pixmap and draw + device.
+*/ +typedef struct +{ + int rotate; + int x_resolution; + int y_resolution; + int width; + int height; + fz_colorspace *colorspace; + int alpha; + int graphics; + int text; +} fz_draw_options; + +FZ_DATA extern const char *fz_draw_options_usage; + +/** + Parse draw device options from a comma separated key-value string. +*/ +fz_draw_options *fz_parse_draw_options(fz_context *ctx, fz_draw_options *options, const char *string); + +/** + Create a new pixmap and draw device, using the specified options. + + options: Options to configure the draw device, and choose the + resolution and colorspace. + + mediabox: The bounds of the page in points. + + pixmap: An out parameter containing the newly created pixmap. +*/ +fz_device *fz_new_draw_device_with_options(fz_context *ctx, const fz_draw_options *options, fz_rect mediabox, fz_pixmap **pixmap); + +#endif diff --git a/include/mupdf/fitz/display-list.h b/include/mupdf/fitz/display-list.h new file mode 100644 index 0000000..957efe4 --- /dev/null +++ b/include/mupdf/fitz/display-list.h @@ -0,0 +1,142 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. 
+// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_DISPLAY_LIST_H +#define MUPDF_FITZ_DISPLAY_LIST_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/device.h" + +/** + Display list device -- record and play back device commands. +*/ + +/** + fz_display_list is a list containing drawing commands (text, + images, etc.). The intent is two-fold: as a caching-mechanism + to reduce parsing of a page, and to be used as a data + structure in multi-threading where one thread parses the page + and another renders pages. + + Create a display list with fz_new_display_list, hand it over to + fz_new_list_device to have it populated, and later replay the + list (once or many times) by calling fz_run_display_list. When + the list is no longer needed drop it with fz_drop_display_list. +*/ +typedef struct fz_display_list fz_display_list; + +/** + Create an empty display list. + + A display list contains drawing commands (text, images, etc.). + Use fz_new_list_device for populating the list. + + mediabox: Bounds of the page (in points) represented by the + display list. +*/ +fz_display_list *fz_new_display_list(fz_context *ctx, fz_rect mediabox); + +/** + Create a rendering device for a display list. + + When the device is rendering a page it will populate the + display list with drawing commands (text, images, etc.). The + display list can later be reused to render a page many times + without having to re-interpret the page from the document file + for each rendering. Once the device is no longer needed, free + it with fz_drop_device. + + list: A display list that the list device takes a reference to. +*/ +fz_device *fz_new_list_device(fz_context *ctx, fz_display_list *list); + +/** + (Re)-run a display list through a device. 
+ + list: A display list, created by fz_new_display_list and + populated with objects from a page by running fz_run_page on a + device obtained from fz_new_list_device. + + ctm: Transform to apply to display list contents. May include + for example scaling and rotation, see fz_scale, fz_rotate and + fz_concat. Set to fz_identity if no transformation is desired. + + scissor: Only the part of the contents of the display list + visible within this area will be considered when the list is + run through the device. This does not apply to tile objects + contained in the display list. + + cookie: Communication mechanism between caller and library + running the page. Intended for multi-threaded applications, + while single-threaded applications set cookie to NULL. The + caller may abort an ongoing page run. Cookie also communicates + progress information back to the caller. The fields inside + cookie are continually updated while the page is being run. +*/ +void fz_run_display_list(fz_context *ctx, fz_display_list *list, fz_device *dev, fz_matrix ctm, fz_rect scissor, fz_cookie *cookie); + +/** + Increment the reference count for a display list. Returns the + same pointer. + + Never throws exceptions. +*/ +fz_display_list *fz_keep_display_list(fz_context *ctx, fz_display_list *list); + +/** + Decrement the reference count for a display list. When the + reference count reaches zero, all the references in the display + list itself are dropped, and the display list is freed. + + Never throws exceptions. +*/ +void fz_drop_display_list(fz_context *ctx, fz_display_list *list); + +/** + Return the bounding box of the page recorded in a display list. +*/ +fz_rect fz_bound_display_list(fz_context *ctx, fz_display_list *list); + +/** + Create a new image from a display list. + + w, h: The conceptual width/height of the image. + + transform: The matrix that needs to be applied to the given + list to make it render to the unit square. + + list: The display list.
+*/ +fz_image *fz_new_image_from_display_list(fz_context *ctx, float w, float h, fz_display_list *list); + +/** + Check for a display list being empty + + list: The list to check. + + Returns true if empty, false otherwise. +*/ +int fz_display_list_is_empty(fz_context *ctx, const fz_display_list *list); + +#endif diff --git a/include/mupdf/fitz/document.h b/include/mupdf/fitz/document.h new file mode 100644 index 0000000..b1e6dfa --- /dev/null +++ b/include/mupdf/fitz/document.h @@ -0,0 +1,1005 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_DOCUMENT_H +#define MUPDF_FITZ_DOCUMENT_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/types.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/device.h" +#include "mupdf/fitz/transition.h" +#include "mupdf/fitz/link.h" +#include "mupdf/fitz/outline.h" +#include "mupdf/fitz/separation.h" + +typedef struct fz_document_handler fz_document_handler; +typedef struct fz_page fz_page; +typedef intptr_t fz_bookmark; + +typedef enum +{ + FZ_MEDIA_BOX, + FZ_CROP_BOX, + FZ_BLEED_BOX, + FZ_TRIM_BOX, + FZ_ART_BOX, + FZ_UNKNOWN_BOX +} fz_box_type; + +fz_box_type fz_box_type_from_string(const char *name); +const char *fz_string_from_box_type(fz_box_type box); + +/** + Simple constructor for fz_locations. +*/ +static inline fz_location fz_make_location(int chapter, int page) +{ + fz_location loc = { chapter, page }; + return loc; +} + +enum +{ + /* 6in at 4:3 */ + FZ_LAYOUT_KINDLE_W = 260, + FZ_LAYOUT_KINDLE_H = 346, + FZ_LAYOUT_KINDLE_EM = 9, + + /* 4.25 x 6.87 in */ + FZ_LAYOUT_US_POCKET_W = 306, + FZ_LAYOUT_US_POCKET_H = 495, + FZ_LAYOUT_US_POCKET_EM = 10, + + /* 5.5 x 8.5 in */ + FZ_LAYOUT_US_TRADE_W = 396, + FZ_LAYOUT_US_TRADE_H = 612, + FZ_LAYOUT_US_TRADE_EM = 11, + + /* 110 x 178 mm */ + FZ_LAYOUT_UK_A_FORMAT_W = 312, + FZ_LAYOUT_UK_A_FORMAT_H = 504, + FZ_LAYOUT_UK_A_FORMAT_EM = 10, + + /* 129 x 198 mm */ + FZ_LAYOUT_UK_B_FORMAT_W = 366, + FZ_LAYOUT_UK_B_FORMAT_H = 561, + FZ_LAYOUT_UK_B_FORMAT_EM = 10, + + /* 135 x 216 mm */ + FZ_LAYOUT_UK_C_FORMAT_W = 382, + FZ_LAYOUT_UK_C_FORMAT_H = 612, + FZ_LAYOUT_UK_C_FORMAT_EM = 11, + + /* 148 x 210 mm */ + FZ_LAYOUT_A5_W = 420, + FZ_LAYOUT_A5_H = 595, + FZ_LAYOUT_A5_EM = 11, + + /* Default to A5 */ + FZ_DEFAULT_LAYOUT_W = FZ_LAYOUT_A5_W, + FZ_DEFAULT_LAYOUT_H = FZ_LAYOUT_A5_H, + FZ_DEFAULT_LAYOUT_EM = FZ_LAYOUT_A5_EM, +}; + +typedef enum +{ + FZ_PERMISSION_PRINT = 'p', + FZ_PERMISSION_COPY = 'c', + FZ_PERMISSION_EDIT = 'e', + FZ_PERMISSION_ANNOTATE = 'n', + 
FZ_PERMISSION_FORM = 'f', + FZ_PERMISSION_ACCESSIBILITY = 'y', + FZ_PERMISSION_ASSEMBLE = 'a', + FZ_PERMISSION_PRINT_HQ = 'h', +} +fz_permission; + +/** + Type for a function to be called when + the reference count for the fz_document drops to 0. The + implementation should release any resources held by the + document. The actual document pointer will be freed by the + caller. +*/ +typedef void (fz_document_drop_fn)(fz_context *ctx, fz_document *doc); + +/** + Type for a function to be + called to enquire whether the document needs a password + or not. See fz_needs_password for more information. +*/ +typedef int (fz_document_needs_password_fn)(fz_context *ctx, fz_document *doc); + +/** + Type for a function to be + called to attempt to authenticate a password. See + fz_authenticate_password for more information. +*/ +typedef int (fz_document_authenticate_password_fn)(fz_context *ctx, fz_document *doc, const char *password); + +/** + Type for a function to be + called to see if a document grants a certain permission. See + fz_document_has_permission for more information. +*/ +typedef int (fz_document_has_permission_fn)(fz_context *ctx, fz_document *doc, fz_permission permission); + +/** + Type for a function to be called to + load the outlines for a document. See fz_document_load_outline + for more information. +*/ +typedef fz_outline *(fz_document_load_outline_fn)(fz_context *ctx, fz_document *doc); + +/** + Type for a function to be called to obtain an outline iterator + for a document. See fz_document_outline_iterator for more information. +*/ +typedef fz_outline_iterator *(fz_document_outline_iterator_fn)(fz_context *ctx, fz_document *doc); + +/** + Type for a function to be called to lay + out a document. See fz_layout_document for more information. 
+*/ +typedef void (fz_document_layout_fn)(fz_context *ctx, fz_document *doc, float w, float h, float em); + +/** + Type for a function to be called to + resolve an internal link to a location (chapter/page number + tuple). See fz_resolve_link_dest for more information. +*/ +typedef fz_link_dest (fz_document_resolve_link_dest_fn)(fz_context *ctx, fz_document *doc, const char *uri); + +/** + Type for a function to be called to + create an internal link to a destination (chapter/page/x/y/w/h/zoom/type + tuple). See fz_resolve_link_dest for more information. +*/ +typedef char * (fz_document_format_link_uri_fn)(fz_context *ctx, fz_document *doc, fz_link_dest dest); + +/** + Type for a function to be called to + count the number of chapters in a document. See + fz_count_chapters for more information. +*/ +typedef int (fz_document_count_chapters_fn)(fz_context *ctx, fz_document *doc); + +/** + Type for a function to be called to + count the number of pages in a document. See fz_count_pages for + more information. +*/ +typedef int (fz_document_count_pages_fn)(fz_context *ctx, fz_document *doc, int chapter); + +/** + Type for a function to load a given + page from a document. See fz_load_page for more information. +*/ +typedef fz_page *(fz_document_load_page_fn)(fz_context *ctx, fz_document *doc, int chapter, int page); + +/** + Type for a function to get the page label of a page in the document. + See fz_page_label for more information. +*/ +typedef void (fz_document_page_label_fn)(fz_context *ctx, fz_document *doc, int chapter, int page, char *buf, size_t size); + +/** + Type for a function to query + a document's metadata. See fz_lookup_metadata for more + information. +*/ +typedef int (fz_document_lookup_metadata_fn)(fz_context *ctx, fz_document *doc, const char *key, char *buf, size_t size); + +/** + Type for a function to set + a document's metadata. See fz_set_metadata for more + information. 
+*/ +typedef int (fz_document_set_metadata_fn)(fz_context *ctx, fz_document *doc, const char *key, const char *value); + +/** + Return output intent color space if it exists +*/ +typedef fz_colorspace *(fz_document_output_intent_fn)(fz_context *ctx, fz_document *doc); + +/** + Write document accelerator data +*/ +typedef void (fz_document_output_accelerator_fn)(fz_context *ctx, fz_document *doc, fz_output *out); + +/** + Type for a function to make + a bookmark. See fz_make_bookmark for more information. +*/ +typedef fz_bookmark (fz_document_make_bookmark_fn)(fz_context *ctx, fz_document *doc, fz_location loc); + +/** + Type for a function to lookup a bookmark. + See fz_lookup_bookmark for more information. +*/ +typedef fz_location (fz_document_lookup_bookmark_fn)(fz_context *ctx, fz_document *doc, fz_bookmark mark); + +/** + Type for a function to release all the + resources held by a page. Called automatically when the + reference count for that page reaches zero. +*/ +typedef void (fz_page_drop_page_fn)(fz_context *ctx, fz_page *page); + +/** + Type for a function to return the + bounding box of a page. See fz_bound_page for more + information. +*/ +typedef fz_rect (fz_page_bound_page_fn)(fz_context *ctx, fz_page *page, fz_box_type box); + +/** + Type for a function to run the + contents of a page. See fz_run_page_contents for more + information. +*/ +typedef void (fz_page_run_page_fn)(fz_context *ctx, fz_page *page, fz_device *dev, fz_matrix transform, fz_cookie *cookie); + +/** + Type for a function to load the links + from a page. See fz_load_links for more information. +*/ +typedef fz_link *(fz_page_load_links_fn)(fz_context *ctx, fz_page *page); + +/** + Type for a function to + obtain the details of how this page should be presented when + in presentation mode. See fz_page_presentation for more + information. 
+*/ +typedef fz_transition *(fz_page_page_presentation_fn)(fz_context *ctx, fz_page *page, fz_transition *transition, float *duration); + +/** + Type for a function to enable/ + disable separations on a page. See fz_control_separation for + more information. +*/ +typedef void (fz_page_control_separation_fn)(fz_context *ctx, fz_page *page, int separation, int disable); + +/** + Type for a function to detect + whether a given separation is enabled or disabled on a page. + See FZ_SEPARATION_DISABLED for more information. +*/ +typedef int (fz_page_separation_disabled_fn)(fz_context *ctx, fz_page *page, int separation); + +/** + Type for a function to retrieve + details of separations on a page. See fz_get_separations + for more information. +*/ +typedef fz_separations *(fz_page_separations_fn)(fz_context *ctx, fz_page *page); + +/** + Type for a function to retrieve + whether or not a given page uses overprint. +*/ +typedef int (fz_page_uses_overprint_fn)(fz_context *ctx, fz_page *page); + + +/** + Type for a function to create a link on a page. +*/ +typedef fz_link *(fz_page_create_link_fn)(fz_context *ctx, fz_page *page, fz_rect bbox, const char *uri); + +/** + Type for a function to delete a link on a page. +*/ +typedef void (fz_page_delete_link_fn)(fz_context *ctx, fz_page *page, fz_link *link); + +/** + Function type to open a document from a file. + + filename: file to open + + Pointer to opened document. Throws exception in case of error. +*/ +typedef fz_document *(fz_document_open_fn)(fz_context *ctx, const char *filename); + +/** + Function type to open a + document from a file. + + stream: fz_stream to read document data from. Must be + seekable for formats that require it. + + Pointer to opened document. Throws exception in case of error. +*/ +typedef fz_document *(fz_document_open_with_stream_fn)(fz_context *ctx, fz_stream *stream); + +/** + Function type to open a document from a + file, with accelerator data. 
+ + filename: file to open + + accel: accelerator file + + Pointer to opened document. Throws exception in case of error. +*/ +typedef fz_document *(fz_document_open_accel_fn)(fz_context *ctx, const char *filename, const char *accel); + +/** + Function type to open a document from a file, + with accelerator data. + + stream: fz_stream to read document data from. Must be + seekable for formats that require it. + + accel: fz_stream to read accelerator data from. Must be + seekable for formats that require it. + + Pointer to opened document. Throws exception in case of error. +*/ +typedef fz_document *(fz_document_open_accel_with_stream_fn)(fz_context *ctx, fz_stream *stream, fz_stream *accel); + +/** + Recognize a document type from + a magic string. + + magic: string to recognise - typically a filename or mime + type. + + Returns a number between 0 (not recognized) and 100 + (fully recognized) based on how certain the recognizer + is that this is of the required type. +*/ +typedef int (fz_document_recognize_fn)(fz_context *ctx, const char *magic); + +/** + Recognize a document type from stream contents. + + stream: stream contents to recognise. + + Returns a number between 0 (not recognized) and 100 + (fully recognized) based on how certain the recognizer + is that this is of the required type. +*/ +typedef int (fz_document_recognize_content_fn)(fz_context *ctx, fz_stream *stream); + +/** + Type for a function to be called when processing an already opened page. + See fz_process_opened_pages. +*/ +typedef void *(fz_process_opened_page_fn)(fz_context *ctx, fz_page *page, void *state); + +/** + Register a handler for a document type. + + handler: The handler to register. +*/ +void fz_register_document_handler(fz_context *ctx, const fz_document_handler *handler); + +/** + Register handlers + for all the standard document types supported in + this build. 
+*/ +void fz_register_document_handlers(fz_context *ctx); + +/** + Given a magic find a document handler that can handle a + document of this type. + + magic: Can be a filename extension (including initial period) or + a mimetype. +*/ +const fz_document_handler *fz_recognize_document(fz_context *ctx, const char *magic); + +/** + Given a filename find a document handler that can handle a + document of this type. + + filename: The filename of the document. This will be opened and sampled + to check data. +*/ +const fz_document_handler *fz_recognize_document_content(fz_context *ctx, const char *filename); + +/** + Given a magic find a document handler that can handle a + document of this type. + + stream: the file stream to sample. + + magic: Can be a filename extension (including initial period) or + a mimetype. +*/ +const fz_document_handler *fz_recognize_document_stream_content(fz_context *ctx, fz_stream *stream, const char *magic); + +/** + Open a document file and read its basic structure so pages and + objects can be located. MuPDF will try to repair broken + documents (without actually changing the file contents). + + The returned fz_document is used when calling most other + document related functions. + + filename: a path to a file as it would be given to open(2). +*/ +fz_document *fz_open_document(fz_context *ctx, const char *filename); + +/** + Open a document file and read its basic structure so pages and + objects can be located. MuPDF will try to repair broken + documents (without actually changing the file contents). + + The returned fz_document is used when calling most other + document related functions. + + filename: a path to a file as it would be given to open(2). +*/ +fz_document *fz_open_accelerated_document(fz_context *ctx, const char *filename, const char *accel); + +/** + Open a document using the specified stream object rather than + opening a file on disk. + + magic: a string used to detect document type; either a file name + or mime-type. 
+*/ +fz_document *fz_open_document_with_stream(fz_context *ctx, const char *magic, fz_stream *stream); + +/** + Open a document using a buffer rather than opening a file on disk. +*/ +fz_document *fz_open_document_with_buffer(fz_context *ctx, const char *magic, fz_buffer *buffer); + +/** + Open a document using the specified stream object rather than + opening a file on disk. + + magic: a string used to detect document type; either a file name + or mime-type. +*/ +fz_document *fz_open_accelerated_document_with_stream(fz_context *ctx, const char *magic, fz_stream *stream, fz_stream *accel); + +/** + Query if the document supports the saving of accelerator data. +*/ +int fz_document_supports_accelerator(fz_context *ctx, fz_document *doc); + +/** + Save accelerator data for the document to a given file. +*/ +void fz_save_accelerator(fz_context *ctx, fz_document *doc, const char *accel); + +/** + Output accelerator data for the document to a given output + stream. +*/ +void fz_output_accelerator(fz_context *ctx, fz_document *doc, fz_output *accel); + +/** + New documents are typically created by calls like + foo_new_document(fz_context *ctx, ...). These work by + deriving a new document type from fz_document, for instance: + typedef struct { fz_document base; ...extras... } foo_document; + These are allocated by calling + fz_new_derived_document(ctx, foo_document) +*/ +void *fz_new_document_of_size(fz_context *ctx, int size); +#define fz_new_derived_document(C,M) ((M*)Memento_label(fz_new_document_of_size(C, sizeof(M)), #M)) + +/** + Increment the document reference count. The same pointer is + returned. + + Never throws exceptions. +*/ +fz_document *fz_keep_document(fz_context *ctx, fz_document *doc); + +/** + Decrement the document reference count. When the reference + count reaches 0, the document and all its references are + freed. + + Never throws exceptions.
+*/ +void fz_drop_document(fz_context *ctx, fz_document *doc); + +/** + Check if a document is encrypted with a + non-blank password. +*/ +int fz_needs_password(fz_context *ctx, fz_document *doc); + +/** + Test if the given password can decrypt the document. + + password: The password string to be checked. Some document + specifications do not specify any particular text encoding, so + neither do we. + + Returns 0 for failure to authenticate, non-zero for success. + + For PDF documents, further information can be given by examining + the bits in the return code. + + Bit 0 => No password required + Bit 1 => User password authenticated + Bit 2 => Owner password authenticated +*/ +int fz_authenticate_password(fz_context *ctx, fz_document *doc, const char *password); + +/** + Load the hierarchical document outline. + + Should be freed by fz_drop_outline. +*/ +fz_outline *fz_load_outline(fz_context *ctx, fz_document *doc); + +/** + Get an iterator for the document outline. + + Should be freed by fz_drop_outline_iterator. +*/ +fz_outline_iterator *fz_new_outline_iterator(fz_context *ctx, fz_document *doc); + +/** + Is the document reflowable. + + Returns 1 to indicate reflowable documents, otherwise 0. +*/ +int fz_is_document_reflowable(fz_context *ctx, fz_document *doc); + +/** + Layout reflowable document types. + + w, h: Page size in points. + em: Default font size in points. +*/ +void fz_layout_document(fz_context *ctx, fz_document *doc, float w, float h, float em); + +/** + Create a bookmark for the given page, which can be used to find + the same location after the document has been laid out with + different parameters. +*/ +fz_bookmark fz_make_bookmark(fz_context *ctx, fz_document *doc, fz_location loc); + +/** + Find a bookmark and return its page number. +*/ +fz_location fz_lookup_bookmark(fz_context *ctx, fz_document *doc, fz_bookmark mark); + +/** + Return the number of pages in document + + May return 0 for documents with no pages. 
+*/ +int fz_count_pages(fz_context *ctx, fz_document *doc); + +/** + Resolve an internal link to a page number, location, and possible viewing parameters. + + Returns location (-1,-1) if the URI cannot be resolved. +*/ +fz_link_dest fz_resolve_link_dest(fz_context *ctx, fz_document *doc, const char *uri); + +/** + Format an internal link to a page number, location, and possible viewing parameters, + suitable for use with fz_create_link. + + Returns a newly allocated string that the caller must free. +*/ +char *fz_format_link_uri(fz_context *ctx, fz_document *doc, fz_link_dest dest); + +/** + Resolve an internal link to a page number. + + xp, yp: Pointer to store coordinate of destination on the page. + + Returns (-1,-1) if the URI cannot be resolved. +*/ +fz_location fz_resolve_link(fz_context *ctx, fz_document *doc, const char *uri, float *xp, float *yp); + +/** + Function to get the location for the last page in the document. + Using this can be far more efficient in some cases than calling + fz_count_pages and using the page number. +*/ +fz_location fz_last_page(fz_context *ctx, fz_document *doc); + +/** + Function to get the location of the next page (allowing for the + end of chapters etc). If at the end of the document, returns the + current location. +*/ +fz_location fz_next_page(fz_context *ctx, fz_document *doc, fz_location loc); + +/** + Function to get the location of the previous page (allowing for + the end of chapters etc). If already at the start of the + document, returns the current page. +*/ +fz_location fz_previous_page(fz_context *ctx, fz_document *doc, fz_location loc); + +/** + Clamps a location into valid chapter/page range. (First clamps + the chapter into range, then the page into range). +*/ +fz_location fz_clamp_location(fz_context *ctx, fz_document *doc, fz_location loc); + +/** + Converts from page number to chapter+page. This may cause many + chapters to be laid out in order to calculate the number of + pages within those chapters. 
+*/ +fz_location fz_location_from_page_number(fz_context *ctx, fz_document *doc, int number); + +/** + Converts from chapter+page to page number. This may cause many + chapters to be laid out in order to calculate the number of + pages within those chapters. +*/ +int fz_page_number_from_location(fz_context *ctx, fz_document *doc, fz_location loc); + +/** + Load a given page number from a document. This may be much less + efficient than loading by location (chapter+page) for some + document types. +*/ +fz_page *fz_load_page(fz_context *ctx, fz_document *doc, int number); + +/** + Return the number of chapters in the document. + At least 1. +*/ +int fz_count_chapters(fz_context *ctx, fz_document *doc); + +/** + Return the number of pages in a chapter. + May return 0. +*/ +int fz_count_chapter_pages(fz_context *ctx, fz_document *doc, int chapter); + +/** + Load a page. + + After fz_load_page it is possible to retrieve the size of the + page using fz_bound_page, or to render the page using + fz_run_page_*. Free the page by calling fz_drop_page. + + chapter: chapter number, 0 is the first chapter of the document. + page: page number, 0 is the first page of the chapter. +*/ +fz_page *fz_load_chapter_page(fz_context *ctx, fz_document *doc, int chapter, int page); + +/** + Load the list of links for a page. + + Returns a linked list of all the links on the page, each with + its clickable region and link destination. Each link is + reference counted so drop and free the list of links by + calling fz_drop_link on the pointer returned from fz_load_links. + + page: Page obtained from fz_load_page. +*/ +fz_link *fz_load_links(fz_context *ctx, fz_page *page); + +/** + Different document types will be implemented by deriving from + fz_page. This macro allocates such derived structures, and + initialises the base sections.
+*/ +fz_page *fz_new_page_of_size(fz_context *ctx, int size, fz_document *doc); +#define fz_new_derived_page(CTX,TYPE,DOC) \ + ((TYPE *)Memento_label(fz_new_page_of_size(CTX,sizeof(TYPE),DOC),#TYPE)) + +/** + Determine the size of a page at 72 dpi. +*/ +fz_rect fz_bound_page(fz_context *ctx, fz_page *page); +fz_rect fz_bound_page_box(fz_context *ctx, fz_page *page, fz_box_type box); + +/** + Run a page through a device. + + page: Page obtained from fz_load_page. + + dev: Device obtained from fz_new_*_device. + + transform: Transform to apply to page. May include for example + scaling and rotation, see fz_scale, fz_rotate and fz_concat. + Set to fz_identity if no transformation is desired. + + cookie: Communication mechanism between caller and library + rendering the page. Intended for multi-threaded applications, + while single-threaded applications set cookie to NULL. The + caller may abort an ongoing rendering of a page. Cookie also + communicates progress information back to the caller. The + fields inside cookie are continually updated while the page is + rendering. +*/ +void fz_run_page(fz_context *ctx, fz_page *page, fz_device *dev, fz_matrix transform, fz_cookie *cookie); + +/** + Run a page through a device. Just the main + page content, without the annotations, if any. + + page: Page obtained from fz_load_page. + + dev: Device obtained from fz_new_*_device. + + transform: Transform to apply to page. May include for example + scaling and rotation, see fz_scale, fz_rotate and fz_concat. + Set to fz_identity if no transformation is desired. + + cookie: Communication mechanism between caller and library + rendering the page. Intended for multi-threaded applications, + while single-threaded applications set cookie to NULL. The + caller may abort an ongoing rendering of a page. Cookie also + communicates progress information back to the caller. The + fields inside cookie are continually updated while the page is + rendering. 
+*/ +void fz_run_page_contents(fz_context *ctx, fz_page *page, fz_device *dev, fz_matrix transform, fz_cookie *cookie); + +/** + Run the annotations on a page through a device. +*/ +void fz_run_page_annots(fz_context *ctx, fz_page *page, fz_device *dev, fz_matrix transform, fz_cookie *cookie); + +/** + Run the widgets on a page through a device. +*/ +void fz_run_page_widgets(fz_context *ctx, fz_page *page, fz_device *dev, fz_matrix transform, fz_cookie *cookie); + +/** + Increment the reference count for the page. Returns the same + pointer. + + Never throws exceptions. +*/ +fz_page *fz_keep_page(fz_context *ctx, fz_page *page); + +/** + Increment the reference count for the page. Returns the same + pointer. Must only be used when the alloc lock is already taken. + + Never throws exceptions. +*/ +fz_page *fz_keep_page_locked(fz_context *ctx, fz_page *page); + +/** + Decrements the reference count for the page. When the reference + count hits 0, the page and its references are freed. + + Never throws exceptions. +*/ +void fz_drop_page(fz_context *ctx, fz_page *page); + +/** + Get the presentation details for a given page. + + transition: A pointer to a transition struct to fill out. + + duration: A pointer to a place to set the page duration in + seconds. Will be set to 0 if no transition is specified for the + page. + + Returns: a pointer to the transition structure, or NULL if there + is no transition specified for the page. +*/ +fz_transition *fz_page_presentation(fz_context *ctx, fz_page *page, fz_transition *transition, float *duration); + +/** + Get page label for a given page. +*/ +const char *fz_page_label(fz_context *ctx, fz_page *page, char *buf, int size); + +/** + Check permission flags on document. +*/ +int fz_has_permission(fz_context *ctx, fz_document *doc, fz_permission p); + +/** + Retrieve document meta data strings. + + doc: The document to query. + + key: Which meta data key to retrieve... 
+ + Basic information: + 'format' -- Document format and version. + 'encryption' -- Description of the encryption used. + + From the document information dictionary: + 'info:Title' + 'info:Author' + 'info:Subject' + 'info:Keywords' + 'info:Creator' + 'info:Producer' + 'info:CreationDate' + 'info:ModDate' + + buf: The buffer to hold the results (a nul-terminated UTF-8 + string). + + size: Size of 'buf'. + + Returns the number of bytes need to store the string plus terminator + (will be larger than 'size' if the output was truncated), or -1 if the + key is not recognized or found. +*/ +int fz_lookup_metadata(fz_context *ctx, fz_document *doc, const char *key, char *buf, int size); + +#define FZ_META_FORMAT "format" +#define FZ_META_ENCRYPTION "encryption" + +#define FZ_META_INFO "info:" +#define FZ_META_INFO_TITLE "info:Title" +#define FZ_META_INFO_AUTHOR "info:Author" +#define FZ_META_INFO_SUBJECT "info:Subject" +#define FZ_META_INFO_KEYWORDS "info:Keywords" +#define FZ_META_INFO_CREATOR "info:Creator" +#define FZ_META_INFO_PRODUCER "info:Producer" +#define FZ_META_INFO_CREATIONDATE "info:CreationDate" +#define FZ_META_INFO_MODIFICATIONDATE "info:ModDate" + +void fz_set_metadata(fz_context *ctx, fz_document *doc, const char *key, const char *value); + +/** + Find the output intent colorspace if the document has defined + one. + + Returns a borrowed reference that should not be dropped, unless + it is kept first. +*/ +fz_colorspace *fz_document_output_intent(fz_context *ctx, fz_document *doc); + +/** + Get the separations details for a page. + This will be NULL, unless the format specifically supports + separations (such as PDF files). May be NULL even + so, if there are no separations on a page. + + Returns a reference that must be dropped. +*/ +fz_separations *fz_page_separations(fz_context *ctx, fz_page *page); + +/** + Query if a given page requires overprint. +*/ +int fz_page_uses_overprint(fz_context *ctx, fz_page *page); + +/** + Create a new link on a page. 
+*/ +fz_link *fz_create_link(fz_context *ctx, fz_page *page, fz_rect bbox, const char *uri); + +/** + Delete an existing link on a page. +*/ +void fz_delete_link(fz_context *ctx, fz_page *page, fz_link *link); + +/** + Iterates over all opened pages of the document, calling the + provided callback for each page for processing. If the callback + returns non-NULL then the iteration stops and that value is returned + to the caller of fz_process_opened_pages(). + + The state pointer provided to fz_process_opened_pages() is + passed on to the callback but is owned by the caller. + + Returns the first non-NULL value returned by the callback, + or NULL if the callback returned NULL for all opened pages. +*/ +void *fz_process_opened_pages(fz_context *ctx, fz_document *doc, fz_process_opened_page_fn *process_openend_page, void *state); + +/* Implementation details: subject to change. */ + +/** + Structure definition is public so other classes can + derive from it. Do not access the members directly. +*/ +struct fz_page +{ + int refs; + fz_document *doc; /* kept reference to parent document. Guaranteed non-NULL. */ + int chapter; /* chapter number */ + int number; /* page number in chapter */ + int incomplete; /* incomplete from progressive loading; don't cache! */ + fz_page_drop_page_fn *drop_page; + fz_page_bound_page_fn *bound_page; + fz_page_run_page_fn *run_page_contents; + fz_page_run_page_fn *run_page_annots; + fz_page_run_page_fn *run_page_widgets; + fz_page_load_links_fn *load_links; + fz_page_page_presentation_fn *page_presentation; + fz_page_control_separation_fn *control_separation; + fz_page_separation_disabled_fn *separation_disabled; + fz_page_separations_fn *separations; + fz_page_uses_overprint_fn *overprint; + fz_page_create_link_fn *create_link; + fz_page_delete_link_fn *delete_link; + + /* linked list of currently open pages. This list is maintained + * by fz_load_chapter_page and fz_drop_page.
All pages hold a + * kept reference to the document, so the document cannot disappear + * while pages exist. 'Incomplete' pages are NOT kept in this + * list. */ + fz_page **prev, *next; +}; + +/** + Structure definition is public so other classes can + derive from it. Callers should not access the members + directly, though implementations will need to initialize + the function pointers directly. +*/ +struct fz_document +{ + int refs; + fz_document_drop_fn *drop_document; + fz_document_needs_password_fn *needs_password; + fz_document_authenticate_password_fn *authenticate_password; + fz_document_has_permission_fn *has_permission; + fz_document_load_outline_fn *load_outline; + fz_document_outline_iterator_fn *outline_iterator; + fz_document_layout_fn *layout; + fz_document_make_bookmark_fn *make_bookmark; + fz_document_lookup_bookmark_fn *lookup_bookmark; + fz_document_resolve_link_dest_fn *resolve_link_dest; + fz_document_format_link_uri_fn *format_link_uri; + fz_document_count_chapters_fn *count_chapters; + fz_document_count_pages_fn *count_pages; + fz_document_load_page_fn *load_page; + fz_document_page_label_fn *page_label; + fz_document_lookup_metadata_fn *lookup_metadata; + fz_document_set_metadata_fn *set_metadata; + fz_document_output_intent_fn *get_output_intent; + fz_document_output_accelerator_fn *output_accelerator; + int did_layout; + int is_reflowable; + + /* Linked list of currently open pages. These are not + * references, but just a linked list of open pages, + * maintained by fz_load_chapter_page, and fz_drop_page. + * Every page holds a kept reference to the document, so + * the document cannot be destroyed while a page exists. + * Incomplete pages are NOT inserted into this list, but + * do still hold a real document reference.
*/ + fz_page *open; +}; + +struct fz_document_handler +{ + fz_document_recognize_fn *recognize; + fz_document_open_fn *open; + fz_document_open_with_stream_fn *open_with_stream; + const char **extensions; + const char **mimetypes; + fz_document_open_accel_fn *open_accel; + fz_document_open_accel_with_stream_fn *open_accel_with_stream; + fz_document_recognize_content_fn *recognize_content; +}; + +#endif diff --git a/include/mupdf/fitz/export.h b/include/mupdf/fitz/export.h new file mode 100644 index 0000000..853e2d5 --- /dev/null +++ b/include/mupdf/fitz/export.h @@ -0,0 +1,52 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_EXPORT_H +#define MUPDF_FITZ_EXPORT_H + +/* + * Support for building/using MuPDF DLL on Windows. + * + * When compiling code that uses MuPDF DLL, FZ_DLL_CLIENT should be defined. + * + * When compiling MuPDF DLL itself, FZ_DLL should be defined. + */ + +#if defined(_WIN32) || defined(_WIN64) + #if defined(FZ_DLL) + /* Building DLL. 
*/ + #define FZ_FUNCTION __declspec(dllexport) + #define FZ_DATA __declspec(dllexport) + #elif defined(FZ_DLL_CLIENT) + /* Building DLL client code. */ + #define FZ_FUNCTION __declspec(dllexport) + #define FZ_DATA __declspec(dllimport) + #else + #define FZ_FUNCTION + #define FZ_DATA + #endif +#else + #define FZ_FUNCTION + #define FZ_DATA +#endif + +#endif diff --git a/include/mupdf/fitz/filter.h b/include/mupdf/fitz/filter.h new file mode 100644 index 0000000..2926b56 --- /dev/null +++ b/include/mupdf/fitz/filter.h @@ -0,0 +1,251 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_FILTER_H +#define MUPDF_FITZ_FILTER_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/store.h" +#include "mupdf/fitz/stream.h" + +typedef struct fz_jbig2_globals fz_jbig2_globals; + +typedef struct +{ + int64_t offset; + uint64_t length; +} fz_range; + +/** + The null filter reads a specified amount of data from the + substream. 
+*/ +fz_stream *fz_open_null_filter(fz_context *ctx, fz_stream *chain, uint64_t len, int64_t offset); + +/** + The range filter copies data from specified ranges of the + chained stream. +*/ +fz_stream *fz_open_range_filter(fz_context *ctx, fz_stream *chain, fz_range *ranges, int nranges); + +/** + The endstream filter reads a PDF substream, and starts to look + for an 'endstream' token after the specified length. +*/ +fz_stream *fz_open_endstream_filter(fz_context *ctx, fz_stream *chain, uint64_t len, int64_t offset); + +/** + Concat filter concatenates several streams into one. +*/ +fz_stream *fz_open_concat(fz_context *ctx, int max, int pad); + +/** + Add a chained stream to the end of the concatenate filter. + + Ownership of chain is passed in. +*/ +void fz_concat_push_drop(fz_context *ctx, fz_stream *concat, fz_stream *chain); + +/** + arc4 filter performs RC4 decoding of data read from the chained + filter using the supplied key. +*/ +fz_stream *fz_open_arc4(fz_context *ctx, fz_stream *chain, unsigned char *key, unsigned keylen); + +/** + aesd filter performs AES decoding of data read from the chained + filter using the supplied key. +*/ +fz_stream *fz_open_aesd(fz_context *ctx, fz_stream *chain, unsigned char *key, unsigned keylen); + +/** + a85d filter performs ASCII 85 Decoding of data read + from the chained filter. +*/ +fz_stream *fz_open_a85d(fz_context *ctx, fz_stream *chain); + +/** + ahxd filter performs ASCII Hex decoding of data read + from the chained filter. +*/ +fz_stream *fz_open_ahxd(fz_context *ctx, fz_stream *chain); + +/** + rld filter performs Run Length Decoding of data read + from the chained filter. +*/ +fz_stream *fz_open_rld(fz_context *ctx, fz_stream *chain); + +/** + dctd filter performs DCT (JPEG) decoding of data read + from the chained filter. 
+ + color_transform implements the PDF color_transform option; + use 0 to disable YUV-RGB / YCCK-CMYK transforms + use >0 to enable YUV-RGB / YCCK-CMYK transforms + use -1 (default) if not embedded in PDF + use -2 (default) if embedded in PDF + + For subsampling on decode, set l2factor to the log2 of the + reduction required (therefore 0 = full size decode). + + jpegtables is an optional stream from which the JPEG tables + can be read. Use NULL if not required. +*/ +fz_stream *fz_open_dctd(fz_context *ctx, fz_stream *chain, int color_transform, int l2factor, fz_stream *jpegtables); + +/** + faxd filter performs FAX decoding of data read from + the chained filter. + + k: see fax specification (fax default is 0). + + end_of_line: whether we expect end of line markers (fax default + is 0). + + encoded_byte_align: whether we align to bytes after each line + (fax default is 0). + + columns: how many columns in the image (fax default is 1728). + + rows: 0 for unspecified or the number of rows of data to expect. + + end_of_block: whether we expect end of block markers (fax + default is 1). + + black_is_1: determines the polarity of the image (fax default is + 0). +*/ +fz_stream *fz_open_faxd(fz_context *ctx, fz_stream *chain, + int k, int end_of_line, int encoded_byte_align, + int columns, int rows, int end_of_block, int black_is_1); + +/** + flated filter performs LZ77 decoding (inflating) of data read + from the chained filter. + + window_bits: How large a decompression window to use. Typically + 15. A negative number, -n, means to use n bits, but to expect + raw data with no header. +*/ +fz_stream *fz_open_flated(fz_context *ctx, fz_stream *chain, int window_bits); + +/** + lzwd filter performs LZW decoding of data read from the chained + filter. + + early_change: (Default 1) specifies whether to change codes 1 + bit early. + + min_bits: (Default 9) specifies the minimum number of bits to + use. 
+ + reverse_bits: (Default 0) allows for compatibility with gif and + old style tiffs (1). + + old_tiff: (Default 0) allows for different handling of the clear + code, as found in old style tiffs. +*/ +fz_stream *fz_open_lzwd(fz_context *ctx, fz_stream *chain, int early_change, int min_bits, int reverse_bits, int old_tiff); + +/** + predict filter performs pixel prediction on data read from + the chained filter. + + predictor: 1 = copy, 2 = tiff, other = inline PNG predictor + + columns: width of image in pixels + + colors: number of components. + + bpc: bits per component (typically 8) +*/ +fz_stream *fz_open_predict(fz_context *ctx, fz_stream *chain, int predictor, int columns, int colors, int bpc); + +/** + Open a filter that performs jbig2 decompression on the chained + stream, using the optional globals record. +*/ +fz_stream *fz_open_jbig2d(fz_context *ctx, fz_stream *chain, fz_jbig2_globals *globals, int embedded); + +/** + Create a jbig2 globals record from a buffer. + + Immutable once created. +*/ +fz_jbig2_globals *fz_load_jbig2_globals(fz_context *ctx, fz_buffer *buf); + +/** + Increment the reference count for a jbig2 globals record. + + Never throws an exception. +*/ +fz_jbig2_globals *fz_keep_jbig2_globals(fz_context *ctx, fz_jbig2_globals *globals); + +/** + Decrement the reference count for a jbig2 globals record. + When the reference count hits zero, the record is freed. + + Never throws an exception. +*/ +void fz_drop_jbig2_globals(fz_context *ctx, fz_jbig2_globals *globals); + +/** + Special jbig2 globals drop function for use in implementing + store support. +*/ +void fz_drop_jbig2_globals_imp(fz_context *ctx, fz_storable *globals); + +/** + Return buffer containing jbig2 globals data stream. +*/ +fz_buffer * fz_jbig2_globals_data(fz_context *ctx, fz_jbig2_globals *globals); + +/* Extra filters for tiff */ + +/** + SGI Log 16bit (greyscale) decode from the chained filter. + Decodes lines of w pixels to 8bpp greyscale. 
+*/ +fz_stream *fz_open_sgilog16(fz_context *ctx, fz_stream *chain, int w); + +/** + SGI Log 24bit (LUV) decode from the chained filter. + Decodes lines of w pixels to 8bpc rgb. +*/ +fz_stream *fz_open_sgilog24(fz_context *ctx, fz_stream *chain, int w); + +/** + SGI Log 32bit (LUV) decode from the chained filter. + Decodes lines of w pixels to 8bpc rgb. +*/ +fz_stream *fz_open_sgilog32(fz_context *ctx, fz_stream *chain, int w); + +/** + 4bit greyscale Thunderscan decoding from the chained filter. + Decodes lines of w pixels to 8bpp greyscale. +*/ +fz_stream *fz_open_thunder(fz_context *ctx, fz_stream *chain, int w); + +#endif diff --git a/include/mupdf/fitz/font.h b/include/mupdf/fitz/font.h new file mode 100644 index 0000000..6480fde --- /dev/null +++ b/include/mupdf/fitz/font.h @@ -0,0 +1,745 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_FONT_H +#define MUPDF_FITZ_FONT_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/color.h" + +/* forward declaration for circular dependency */ +struct fz_device; + +/* Various font encoding tables and lookup functions */ + +FZ_DATA extern const char *fz_glyph_name_from_adobe_standard[256]; +FZ_DATA extern const char *fz_glyph_name_from_iso8859_7[256]; +FZ_DATA extern const char *fz_glyph_name_from_koi8u[256]; +FZ_DATA extern const char *fz_glyph_name_from_mac_expert[256]; +FZ_DATA extern const char *fz_glyph_name_from_mac_roman[256]; +FZ_DATA extern const char *fz_glyph_name_from_win_ansi[256]; +FZ_DATA extern const char *fz_glyph_name_from_windows_1252[256]; + +FZ_DATA extern const unsigned short fz_unicode_from_iso8859_1[256]; +FZ_DATA extern const unsigned short fz_unicode_from_iso8859_7[256]; +FZ_DATA extern const unsigned short fz_unicode_from_koi8u[256]; +FZ_DATA extern const unsigned short fz_unicode_from_pdf_doc_encoding[256]; +FZ_DATA extern const unsigned short fz_unicode_from_windows_1250[256]; +FZ_DATA extern const unsigned short fz_unicode_from_windows_1251[256]; +FZ_DATA extern const unsigned short fz_unicode_from_windows_1252[256]; + +int fz_iso8859_1_from_unicode(int u); +int fz_iso8859_7_from_unicode(int u); +int fz_koi8u_from_unicode(int u); +int fz_windows_1250_from_unicode(int u); +int fz_windows_1251_from_unicode(int u); +int fz_windows_1252_from_unicode(int u); + +int fz_unicode_from_glyph_name(const char *name); +int fz_unicode_from_glyph_name_strict(const char *name); +const char **fz_duplicate_glyph_names_from_unicode(int unicode); +const char *fz_glyph_name_from_unicode_sc(int unicode); + +/** + An abstract font handle. +*/ +typedef struct fz_font fz_font; + +/** + Fonts come in two variants: + Regular fonts are handled by FreeType. + Type 3 fonts have callbacks to the interpreter. 
+*/ + +/** + Retrieve the FT_Face handle + for the font. + + font: The font to query + + Returns the FT_Face handle for the font, or NULL + if not a freetype handled font. (Cast to void * + to avoid nasty header exposure). +*/ +void *fz_font_ft_face(fz_context *ctx, fz_font *font); + +/** + Retrieve the Type3 procs + for a font. + + font: The font to query + + Returns the t3_procs pointer. Will be NULL for a + non type-3 font. +*/ +fz_buffer **fz_font_t3_procs(fz_context *ctx, fz_font *font); + +/* common CJK font collections */ +enum { FZ_ADOBE_CNS, FZ_ADOBE_GB, FZ_ADOBE_JAPAN, FZ_ADOBE_KOREA }; + +/** + Every fz_font carries a set of flags + within it, in a fz_font_flags_t structure. +*/ +typedef struct +{ + unsigned int is_mono : 1; + unsigned int is_serif : 1; + unsigned int is_bold : 1; + unsigned int is_italic : 1; + unsigned int ft_substitute : 1; /* use substitute metrics */ + unsigned int ft_stretch : 1; /* stretch to match PDF metrics */ + + unsigned int fake_bold : 1; /* synthesize bold */ + unsigned int fake_italic : 1; /* synthesize italic */ + unsigned int has_opentype : 1; /* has opentype shaping tables */ + unsigned int invalid_bbox : 1; + + unsigned int cjk : 1; + unsigned int cjk_lang : 2; /* CNS, GB, JAPAN, or KOREA */ + + unsigned int embed : 1; + unsigned int never_embed : 1; +} fz_font_flags_t; + +/** + Retrieve a pointer to the font flags + for a given font. These can then be updated as required. + + font: The font to query + + Returns a pointer to the flags structure (or NULL, if + the font is NULL). +*/ +fz_font_flags_t *fz_font_flags(fz_font *font); + +/** + In order to shape a given font, we need to + declare it to a shaper library (harfbuzz, by default, but others + are possible). To avoid redeclaring it every time we need to + shape, we hold a shaper handle and the destructor for it within + the font itself. The handle is initialised by the caller when + first required and the destructor is called when the fz_font is + destroyed. 
+*/ +typedef struct +{ + void *shaper_handle; + void (*destroy)(fz_context *ctx, void *); /* Destructor for shaper_handle */ +} fz_shaper_data_t; + +/** + Retrieve a pointer to the shaper data + structure for the given font. + + font: The font to query. + + Returns a pointer to the shaper data structure (or NULL if + font is NULL). +*/ +fz_shaper_data_t *fz_font_shaper_data(fz_context *ctx, fz_font *font); + +/** + Retrieve a pointer to the name of the font. + + font: The font to query. + + Returns a pointer to an internal copy of the font name. + Will never be NULL, but may be the empty string. +*/ +const char *fz_font_name(fz_context *ctx, fz_font *font); + +/** + Query whether the font flags say that this font is bold. +*/ +int fz_font_is_bold(fz_context *ctx, fz_font *font); + +/** + Query whether the font flags say that this font is italic. +*/ +int fz_font_is_italic(fz_context *ctx, fz_font *font); + +/** + Query whether the font flags say that this font is serif. +*/ +int fz_font_is_serif(fz_context *ctx, fz_font *font); + +/** + Query whether the font flags say that this font is monospaced. +*/ +int fz_font_is_monospaced(fz_context *ctx, fz_font *font); + +/** + Retrieve the font bbox. + + font: The font to query. + + Returns the font bbox by value; it is valid only if + fz_font_flags(font)->invalid_bbox is zero. +*/ +fz_rect fz_font_bbox(fz_context *ctx, fz_font *font); + +/** + Type for user supplied system font loading hook. + + name: The name of the font to load. + + bold: 1 if a bold font is desired, 0 otherwise. + + italic: 1 if an italic font is desired, 0 otherwise. + + needs_exact_metrics: 1 if an exact metric match is required for + the font requested. + + Returns a new font handle, or NULL if no font found (or on error). +*/ +typedef fz_font *(fz_load_system_font_fn)(fz_context *ctx, const char *name, int bold, int italic, int needs_exact_metrics); + +/** + Type for user supplied cjk font loading hook. + + name: The name of the font to load.
+ + ordering: The ordering for which to load the font (e.g. + FZ_ADOBE_KOREA) + + serif: 1 if a serif font is desired, 0 otherwise. + + Returns a new font handle, or NULL if no font found (or on error). +*/ +typedef fz_font *(fz_load_system_cjk_font_fn)(fz_context *ctx, const char *name, int ordering, int serif); + +/** + Type for user supplied fallback font loading hook. + + script: UCDN script enum. + + language: FZ_LANG enum. + + serif, bold, italic: boolean style flags. + + Returns a new font handle, or NULL if no font found (or on error). +*/ +typedef fz_font *(fz_load_system_fallback_font_fn)(fz_context *ctx, int script, int language, int serif, int bold, int italic); + +/** + Install functions to allow MuPDF to request fonts from the + system. + + Only one set of hooks can be in use at a time. +*/ +void fz_install_load_system_font_funcs(fz_context *ctx, + fz_load_system_font_fn *f, + fz_load_system_cjk_font_fn *f_cjk, + fz_load_system_fallback_font_fn *f_fallback); + +/** + Attempt to load a given font from the system. + + name: The name of the desired font. + + bold: 1 if bold desired, 0 otherwise. + + italic: 1 if italic desired, 0 otherwise. + + needs_exact_metrics: 1 if an exact metrical match is required, + 0 otherwise. + + Returns a new font handle, or NULL if no matching font was found + (or on error). +*/ +fz_font *fz_load_system_font(fz_context *ctx, const char *name, int bold, int italic, int needs_exact_metrics); + +/** + Attempt to load a given CJK font from + the system. + + name: The name of the desired font. + + ordering: The ordering to load the font from (e.g. FZ_ADOBE_KOREA) + + serif: 1 if serif desired, 0 otherwise. + + Returns a new font handle, or NULL if no matching font was found + (or on error). +*/ +fz_font *fz_load_system_cjk_font(fz_context *ctx, const char *name, int ordering, int serif); + +/** + Search the builtin fonts for a match.
+ Whether a given font is present or not will depend on the + configuration in which MuPDF is built. + + name: The name of the font desired. + + bold: 1 if bold desired, 0 otherwise. + + italic: 1 if italic desired, 0 otherwise. + + len: Pointer to a place to receive the length of the discovered + font buffer. + + Returns a pointer to the font file data, or NULL if not present. +*/ +const unsigned char *fz_lookup_builtin_font(fz_context *ctx, const char *name, int bold, int italic, int *len); + +/** + Search the builtin base14 fonts for a match. + Whether a given font is present or not will depend on the + configuration in which MuPDF is built. + + name: The name of the font desired. + + len: Pointer to a place to receive the length of the discovered + font buffer. + + Returns a pointer to the font file data, or NULL if not present. +*/ +const unsigned char *fz_lookup_base14_font(fz_context *ctx, const char *name, int *len); + +/** + Search the builtin cjk fonts for a match. + Whether a font is present or not will depend on the + configuration in which MuPDF is built. + + ordering: The desired ordering of the font (e.g. FZ_ADOBE_KOREA). + + len: Pointer to a place to receive the length of the discovered + font buffer. + + Returns a pointer to the font file data, or NULL if not present. +*/ +const unsigned char *fz_lookup_cjk_font(fz_context *ctx, int ordering, int *len, int *index); + +/** + Search the builtin cjk fonts for a match for a given language. + Whether a font is present or not will depend on the + configuration in which MuPDF is built. + + lang: Pointer to a (case sensitive) language string (e.g. + "ja", "ko", "zh-Hant" etc). + + len: Pointer to a place to receive the length of the discovered + font buffer. + + subfont: Pointer to a place to store the subfont index of the + discovered font. + + Returns a pointer to the font file data, or NULL if not present. 
+*/ +const unsigned char *fz_lookup_cjk_font_by_language(fz_context *ctx, const char *lang, int *len, int *subfont); + +/** + Return the matching FZ_ADOBE_* ordering + for the given language tag, such as "zh-Hant", "zh-Hans", "ja", or "ko". +*/ +int fz_lookup_cjk_ordering_by_language(const char *name); + +/** + Search the builtin noto fonts for a match. + Whether a font is present or not will depend on the + configuration in which MuPDF is built. + + script: The script desired (e.g. UCDN_SCRIPT_KATAKANA). + + lang: The language desired (e.g. FZ_LANG_ja). + + len: Pointer to a place to receive the length of the discovered + font buffer. + + Returns a pointer to the font file data, or NULL if not present. +*/ +const unsigned char *fz_lookup_noto_font(fz_context *ctx, int script, int lang, int *len, int *subfont); + +/** + Search the builtin noto fonts specific symbol fonts. + Whether a font is present or not will depend on the + configuration in which MuPDF is built. +*/ +const unsigned char *fz_lookup_noto_math_font(fz_context *ctx, int *len); +const unsigned char *fz_lookup_noto_music_font(fz_context *ctx, int *len); +const unsigned char *fz_lookup_noto_symbol1_font(fz_context *ctx, int *len); +const unsigned char *fz_lookup_noto_symbol2_font(fz_context *ctx, int *len); +const unsigned char *fz_lookup_noto_emoji_font(fz_context *ctx, int *len); + +/** + Try to load a fallback font for the + given combination of font attributes. Whether a font is + present or not will depend on the configuration in which + MuPDF is built. + + script: The script desired (e.g. UCDN_SCRIPT_KATAKANA). + + language: The language desired (e.g. FZ_LANG_ja). + + serif: 1 if serif desired, 0 otherwise. + + bold: 1 if bold desired, 0 otherwise. + + italic: 1 if italic desired, 0 otherwise. + + Returns a new font handle, or NULL if not available. 
+*/ +fz_font *fz_load_fallback_font(fz_context *ctx, int script, int language, int serif, int bold, int italic); + +/** + Create a new (empty) type3 font. + + name: Name of font (or NULL). + + matrix: Font matrix. + + Returns a new font handle, or throws exception on + allocation failure. +*/ +fz_font *fz_new_type3_font(fz_context *ctx, const char *name, fz_matrix matrix); + +/** + Create a new font from a font file in memory. + + Fonts created in this way will be eligible for embedding by default. + + name: Name of font (leave NULL to use name from font). + + data: Pointer to the font file data. + + len: Length of the font file data. + + index: Which font from the file to load (0 for default). + + use_glyph_bbox: 1 if we should use the glyph bbox, 0 otherwise. + + Returns a new font handle, or throws an exception on error. +*/ +fz_font *fz_new_font_from_memory(fz_context *ctx, const char *name, const unsigned char *data, int len, int index, int use_glyph_bbox); + +/** + Create a new font from a font file in a fz_buffer. + + Fonts created in this way will be eligible for embedding by default. + + name: Name of font (leave NULL to use name from font). + + buffer: Buffer to load from. + + index: Which font from the file to load (0 for default). + + use_glyph_bbox: 1 if we should use the glyph bbox, 0 otherwise. + + Returns a new font handle, or throws an exception on error. +*/ +fz_font *fz_new_font_from_buffer(fz_context *ctx, const char *name, fz_buffer *buffer, int index, int use_glyph_bbox); + +/** + Create a new font from a font file. + + Fonts created in this way will be eligible for embedding by default. + + name: Name of font (leave NULL to use name from font). + + path: File path to load from. + + index: Which font from the file to load (0 for default). + + use_glyph_bbox: 1 if we should use the glyph bbox, 0 otherwise. + + Returns a new font handle, or throws an exception on error.
+*/ +fz_font *fz_new_font_from_file(fz_context *ctx, const char *name, const char *path, int index, int use_glyph_bbox); + +/** + Create a new font from one of the built-in fonts. +*/ +fz_font *fz_new_base14_font(fz_context *ctx, const char *name); +fz_font *fz_new_cjk_font(fz_context *ctx, int ordering); +fz_font *fz_new_builtin_font(fz_context *ctx, const char *name, int is_bold, int is_italic); + +/** + Control whether a given font should be embedded or not when writing. +*/ +void fz_set_font_embedding(fz_context *ctx, fz_font *font, int embed); + +/** + Add a reference to an existing fz_font. + + font: The font to add a reference to. + + Returns the same font. +*/ +fz_font *fz_keep_font(fz_context *ctx, fz_font *font); + +/** + Drop a reference to a fz_font, destroying the + font when the last reference is dropped. + + font: The font to drop a reference to. +*/ +void fz_drop_font(fz_context *ctx, fz_font *font); + +/** + Set the font bbox. + + font: The font to set the bbox for. + + xmin, ymin, xmax, ymax: The bounding box. +*/ +void fz_set_font_bbox(fz_context *ctx, fz_font *font, float xmin, float ymin, float xmax, float ymax); + +/** + Return a bbox for a given glyph in a font. + + font: The font to look for the glyph in. + + gid: The glyph to bound. + + trm: The matrix to apply to the glyph before bounding. + + Returns rectangle by value containing the bounds of the given + glyph. +*/ +fz_rect fz_bound_glyph(fz_context *ctx, fz_font *font, int gid, fz_matrix trm); + +/** + Determine if a given glyph in a font + is cacheable. Certain glyphs in a type 3 font cannot safely + be cached, as their appearance depends on the enclosing + graphic state. + + font: The font to look for the glyph in. + + gid: The glyph to query. + + Returns non-zero if cacheable, 0 if not. +*/ +int fz_glyph_cacheable(fz_context *ctx, fz_font *font, int gid); + +/** + Run a glyph from a Type3 font to + a given device. + + font: The font to find the glyph in. + + gid: The glyph to run.
+ + trm: The transform to apply. + + dev: The device to render onto. +*/ +void fz_run_t3_glyph(fz_context *ctx, fz_font *font, int gid, fz_matrix trm, struct fz_device *dev); + +/** + Return the advance for a given glyph. + + font: The font to look for the glyph in. + + glyph: The glyph to find the advance for. + + wmode: 1 for vertical mode, 0 for horizontal. + + Returns the advance for the glyph. +*/ +float fz_advance_glyph(fz_context *ctx, fz_font *font, int glyph, int wmode); + +/** + Find the glyph id for a given unicode + character within a font. + + font: The font to look for the unicode character in. + + unicode: The unicode character to encode. + + Returns the glyph id for the given unicode value, or 0 if + unknown. +*/ +int fz_encode_character(fz_context *ctx, fz_font *font, int unicode); + +/** + Encode character, preferring small-caps variant if available. + + font: The font to look for the unicode character in. + + unicode: The unicode character to encode. + + Returns the glyph id for the given unicode value, or 0 if + unknown. +*/ +int fz_encode_character_sc(fz_context *ctx, fz_font *font, int unicode); + +/** + Encode character. + + Either by direct lookup of glyphname within a font, or, failing + that, by mapping glyphname to unicode and thence to the glyph + index within the given font. + + Returns zero for type3 fonts. +*/ +int fz_encode_character_by_glyph_name(fz_context *ctx, fz_font *font, const char *glyphname); + +/** + Find the glyph id for + a given unicode character within a font, falling back to + an alternative if not found. + + font: The font to look for the unicode character in. + + unicode: The unicode character to encode. + + script: The script in use. + + language: The language in use. + + out_font: The font handle in which the given glyph represents + the requested unicode character. The caller does not own the + reference it is passed, so should call fz_keep_font if it is + not simply to be used immediately. 
+ + Returns the glyph id for the given unicode value in the supplied + font (and sets *out_font to font) if it is present. Otherwise + an alternative fallback font (based on script/language) is + searched for. If the glyph is found therein, *out_font is set + to this reference, and the glyph reference is returned. If it + cannot be found anywhere, the function returns 0. +*/ +int fz_encode_character_with_fallback(fz_context *ctx, fz_font *font, int unicode, int script, int language, fz_font **out_font); + +/** + Find the name of a glyph + + font: The font to look for the glyph in. + + glyph: The glyph id to look for. + + buf: Pointer to a buffer for the name to be inserted into. + + size: The size of the buffer. + + If a font contains a name table, then the name of the glyph + will be returned in the supplied buffer. Otherwise a name + is synthesised. The name will be truncated to fit in + the buffer. +*/ +void fz_get_glyph_name(fz_context *ctx, fz_font *font, int glyph, char *buf, int size); + +/** + Retrieve font ascender in ems. +*/ +float fz_font_ascender(fz_context *ctx, fz_font *font); + +/** + Retrieve font descender in ems. +*/ +float fz_font_descender(fz_context *ctx, fz_font *font); + +/** + Retrieve the MD5 digest for the font's data. +*/ +void fz_font_digest(fz_context *ctx, fz_font *font, unsigned char digest[16]); + +/* Implementation details: subject to change. */ + +void fz_decouple_type3_font(fz_context *ctx, fz_font *font, void *t3doc); + +/** + map an FT error number to a + static string. + + err: The error number to lookup. + + Returns a pointer to a static textual representation + of a freetype error. +*/ +const char *ft_error_string(int err); +int ft_char_index(void *face, int cid); +int ft_name_index(void *face, const char *name); + +/** + Internal functions for our Harfbuzz integration + to work around the lack of thread safety. +*/ + +/** + Lock against Harfbuzz being called + simultaneously in several threads. 
This reuses + FZ_LOCK_FREETYPE. +*/ +void fz_hb_lock(fz_context *ctx); + +/** + Unlock after a Harfbuzz call. This reuses + FZ_LOCK_FREETYPE. +*/ +void fz_hb_unlock(fz_context *ctx); + +struct fz_font +{ + int refs; + char name[32]; + fz_buffer *buffer; + + fz_font_flags_t flags; + + void *ft_face; /* has an FT_Face if used */ + fz_shaper_data_t shaper_data; + + fz_matrix t3matrix; + void *t3resources; + fz_buffer **t3procs; /* has 256 entries if used */ + struct fz_display_list **t3lists; /* has 256 entries if used */ + float *t3widths; /* has 256 entries if used */ + unsigned short *t3flags; /* has 256 entries if used */ + void *t3doc; /* a pdf_document for the callback */ + void (*t3run)(fz_context *ctx, void *doc, void *resources, fz_buffer *contents, struct fz_device *dev, fz_matrix ctm, void *gstate, fz_default_colorspaces *default_cs); + void (*t3freeres)(fz_context *ctx, void *doc, void *resources); + + fz_rect bbox; /* font bbox is used only for t3 fonts */ + + int glyph_count; + + /* per glyph bounding box cache. */ + fz_rect **bbox_table; + int use_glyph_bbox; + + /* substitute metrics */ + int width_count; + short width_default; /* in 1000 units */ + short *width_table; /* in 1000 units */ + + /* cached glyph metrics */ + float **advance_cache; + + /* cached encoding lookup */ + uint16_t *encoding_cache[256]; + + /* cached md5sum for caching */ + int has_digest; + unsigned char digest[16]; + + /* Which font to use in a collection. */ + int subfont; +}; + +#endif diff --git a/include/mupdf/fitz/geometry.h b/include/mupdf/fitz/geometry.h new file mode 100644 index 0000000..57ca0e8 --- /dev/null +++ b/include/mupdf/fitz/geometry.h @@ -0,0 +1,818 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. 
+// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_MATH_H +#define MUPDF_FITZ_MATH_H + +#include "mupdf/fitz/system.h" + +#include <assert.h> + +/** + Multiply two integers in the 0..255 range, treating them as + fractions of 255 (the result is a * b / 255, rounded). +*/ +static inline int fz_mul255(int a, int b) +{ + /* see Jim Blinn's book "Dirty Pixels" for how this works */ + int x = a * b + 128; + x += x >> 8; + return x >> 8; +} + +/** + Undo alpha premultiplication. +*/ +static inline int fz_div255(int c, int a) +{ + return a ? c * (255 * 256 / a) >> 8 : 0; +} + +/** + Expand a value A from the 0..255 range to the 0..256 range +*/ +#define FZ_EXPAND(A) ((A)+((A)>>7)) + +/** + Combine values A (in any range) and B (in the 0..256 range), + to give a single value in the same range as A was. +*/ +#define FZ_COMBINE(A,B) (((A)*(B))>>8) + +/** + Combine values A and C (in the same (any) range) and B and D (in + the 0..256 range), to give a single value in the same range as A + and C were. +*/ +#define FZ_COMBINE2(A,B,C,D) (((A) * (B) + (C) * (D))>>8) + +/** + Blend SRC and DST (in the same range) together according to + AMOUNT (in the 0..256 range).
+*/ +#define FZ_BLEND(SRC, DST, AMOUNT) ((((SRC)-(DST))*(AMOUNT) + ((DST)<<8))>>8) + +/** + Range checking atof +*/ +float fz_atof(const char *s); + +/** + atoi that copes with NULL +*/ +int fz_atoi(const char *s); + +/** + 64bit atoi that copes with NULL +*/ +int64_t fz_atoi64(const char *s); + +/** + Some standard math functions, done as static inlines for speed. + People with compilers that do not adequately implement inline + may like to reimplement these using macros. +*/ +static inline float fz_abs(float f) +{ + return (f < 0 ? -f : f); +} + +static inline int fz_absi(int i) +{ + return (i < 0 ? -i : i); +} + +static inline float fz_min(float a, float b) +{ + return (a < b ? a : b); +} + +static inline int fz_mini(int a, int b) +{ + return (a < b ? a : b); +} + +static inline size_t fz_minz(size_t a, size_t b) +{ + return (a < b ? a : b); +} + +static inline int64_t fz_mini64(int64_t a, int64_t b) +{ + return (a < b ? a : b); +} + +static inline float fz_max(float a, float b) +{ + return (a > b ? a : b); +} + +static inline int fz_maxi(int a, int b) +{ + return (a > b ? a : b); +} + +static inline size_t fz_maxz(size_t a, size_t b) +{ + return (a > b ? a : b); +} + +static inline int64_t fz_maxi64(int64_t a, int64_t b) +{ + return (a > b ? a : b); +} + +static inline float fz_clamp(float x, float min, float max) +{ + return x < min ? min : x > max ? max : x; +} + +static inline int fz_clampi(int x, int min, int max) +{ + return x < min ? min : x > max ? max : x; +} + +static inline int64_t fz_clamp64(int64_t x, int64_t min, int64_t max) +{ + return x < min ? min : x > max ? max : x; +} + +static inline double fz_clampd(double x, double min, double max) +{ + return x < min ? min : x > max ? max : x; +} + +static inline void *fz_clampp(void *x, void *min, void *max) +{ + return x < min ? min : x > max ? max : x; +} + +#define DIV_BY_ZERO(a, b, min, max) (((a) < 0) ^ ((b) < 0) ? (min) : (max)) + +/** + fz_point is a point in a two-dimensional space. 
+*/ +typedef struct +{ + float x, y; +} fz_point; + +static inline fz_point fz_make_point(float x, float y) +{ + fz_point p = { x, y }; + return p; +} + +/** + fz_rect is a rectangle represented by two diagonally opposite + corners at arbitrary coordinates. + + Rectangles are always axis-aligned with the X- and Y- axes. We + wish to distinguish rectangles in 3 categories; infinite, finite, + and invalid. Zero area rectangles are a sub-category of finite + ones. + + For all valid rectangles, x0 <= x1 and y0 <= y1 in all cases. + Infinite rectangles have x0 = y0 = FZ_MIN_INF_RECT, + x1 = y1 = FZ_MAX_INF_RECT. For any non infinite valid rectangle, + the area is defined as (x1 - x0) * (y1 - y0). + + To check for empty or infinite rectangles use fz_is_empty_rect + and fz_is_infinite_rect. To check for valid rectangles use + fz_is_valid_rect. + + We choose this representation, so that we can easily distinguish + the difference between intersecting 2 valid rectangles and + getting an invalid one, as opposed to getting a zero area one + (which nonetheless has valid bounds within the plane). + + x0, y0: The top left corner. + + x1, y1: The bottom right corner. + + We choose FZ_{MIN,MAX}_INF_RECT to be the largest 32bit signed + integer values that survive roundtripping to floats. +*/ +#define FZ_MIN_INF_RECT ((int)0x80000000) +#define FZ_MAX_INF_RECT ((int)0x7fffff80) + +typedef struct +{ + float x0, y0; + float x1, y1; +} fz_rect; + +static inline fz_rect fz_make_rect(float x0, float y0, float x1, float y1) +{ + fz_rect r = { x0, y0, x1, y1 }; + return r; +} + +/** + fz_irect is a rectangle using integers instead of floats. + + It's used in the draw device and for pixmap dimensions. +*/ +typedef struct +{ + int x0, y0; + int x1, y1; +} fz_irect; + +static inline fz_irect fz_make_irect(int x0, int y0, int x1, int y1) +{ + fz_irect r = { x0, y0, x1, y1 }; + return r; +} + +/** + A rectangle with sides of length one. 
+ + The bottom left corner is at (0, 0) and the top right corner + is at (1, 1). +*/ +FZ_DATA extern const fz_rect fz_unit_rect; + +/** + An empty rectangle with an area equal to zero. +*/ +FZ_DATA extern const fz_rect fz_empty_rect; +FZ_DATA extern const fz_irect fz_empty_irect; + +/** + An infinite rectangle. +*/ +FZ_DATA extern const fz_rect fz_infinite_rect; +FZ_DATA extern const fz_irect fz_infinite_irect; + +/** + Check if rectangle is empty. + + An empty rectangle is defined as one whose area is zero. + All invalid rectangles are empty. +*/ +static inline int fz_is_empty_rect(fz_rect r) +{ + return (r.x0 >= r.x1 || r.y0 >= r.y1); +} + +static inline int fz_is_empty_irect(fz_irect r) +{ + return (r.x0 >= r.x1 || r.y0 >= r.y1); +} + +/** + Check if rectangle is infinite. +*/ +static inline int fz_is_infinite_rect(fz_rect r) +{ + return (r.x0 == FZ_MIN_INF_RECT && r.x1 == FZ_MAX_INF_RECT && + r.y0 == FZ_MIN_INF_RECT && r.y1 == FZ_MAX_INF_RECT); +} + +/** + Check if an integer rectangle + is infinite. +*/ +static inline int fz_is_infinite_irect(fz_irect r) +{ + return (r.x0 == FZ_MIN_INF_RECT && r.x1 == FZ_MAX_INF_RECT && + r.y0 == FZ_MIN_INF_RECT && r.y1 == FZ_MAX_INF_RECT); +} + +/** + Check if rectangle is valid. +*/ +static inline int fz_is_valid_rect(fz_rect r) +{ + return (r.x0 <= r.x1 && r.y0 <= r.y1); +} + +/** + Check if an integer rectangle is valid. +*/ +static inline int fz_is_valid_irect(fz_irect r) +{ + return (r.x0 <= r.x1 && r.y0 <= r.y1); +} + +/** + Return the width of an irect. Invalid irects return 0. +*/ +static inline unsigned int +fz_irect_width(fz_irect r) +{ + unsigned int w; + if (r.x0 >= r.x1) + return 0; + /* Check for w overflowing. This should never happen, but + * if it does, it's pretty likely an indication of a severe + * problem. */ + w = (unsigned int)r.x1 - r.x0; + assert((int)w >= 0); + if ((int)w < 0) + return 0; + return (int)w; +} + +/** + Return the height of an irect. Invalid irects return 0. 
+*/ +static inline int +fz_irect_height(fz_irect r) +{ + unsigned int h; + if (r.y0 >= r.y1) + return 0; + /* Check for h overflowing. This should never happen, but + * if it does, it's pretty likely an indication of a severe + * problem. */ + h = (unsigned int)(r.y1 - r.y0); + assert((int)h >= 0); + if ((int)h < 0) + return 0; + return (int)h; +} + +/** + fz_matrix is a row-major 3x3 matrix used for representing + transformations of coordinates throughout MuPDF. + + Since all points reside in a two-dimensional space, one vector + is always a constant unit vector; hence only some elements may + vary in a matrix. Below is how the elements map between + different representations. + + / a b 0 \ + | c d 0 | normally represented as [ a b c d e f ]. + \ e f 1 / +*/ +typedef struct +{ + float a, b, c, d, e, f; +} fz_matrix; + +/** + Identity transform matrix. +*/ +FZ_DATA extern const fz_matrix fz_identity; + +static inline fz_matrix fz_make_matrix(float a, float b, float c, float d, float e, float f) +{ + fz_matrix m = { a, b, c, d, e, f }; + return m; +} + +static inline int fz_is_identity(fz_matrix m) +{ + return m.a == 1 && m.b == 0 && m.c == 0 && m.d == 1 && m.e == 0 && m.f == 0; +} + +/** + Multiply two matrices. + + The order of the two matrices is important since matrix + multiplication is not commutative. + + Returns the resulting matrix. +*/ +fz_matrix fz_concat(fz_matrix left, fz_matrix right); + +/** + Create a scaling matrix. + + The returned matrix is of the form [ sx 0 0 sy 0 0 ]. + + sx, sy: Scaling factors along the X- and Y-axes. A scaling + factor of 1.0 will not cause any scaling along the relevant + axis. + + Returns the new matrix. +*/ +fz_matrix fz_scale(float sx, float sy); + +/** + Scale a matrix by premultiplication. + + m: The matrix to scale + + sx, sy: Scaling factors along the X- and Y-axes. A scaling + factor of 1.0 will not cause any scaling along the relevant + axis. + + Returns the scaled matrix.
+*/ +fz_matrix fz_pre_scale(fz_matrix m, float sx, float sy); + +/** + Scale a matrix by postmultiplication. + + m: The matrix to scale + + sx, sy: Scaling factors along the X- and Y-axes. A scaling + factor of 1.0 will not cause any scaling along the relevant + axis. + + Returns the scaled matrix. +*/ +fz_matrix fz_post_scale(fz_matrix m, float sx, float sy); + +/** + Create a shearing matrix. + + The returned matrix is of the form [ 1 sy sx 1 0 0 ]. + + sx, sy: Shearing factors. A shearing factor of 0.0 will not + cause any shearing along the relevant axis. + + Returns the new matrix. +*/ +fz_matrix fz_shear(float sx, float sy); + +/** + Premultiply a matrix with a shearing matrix. + + The shearing matrix is of the form [ 1 sy sx 1 0 0 ]. + + m: The matrix to premultiply + + sx, sy: Shearing factors. A shearing factor of 0.0 will not + cause any shearing along the relevant axis. + + Returns the sheared matrix. +*/ +fz_matrix fz_pre_shear(fz_matrix m, float sx, float sy); + +/** + Create a rotation matrix. + + The returned matrix is of the form + [ cos(deg) sin(deg) -sin(deg) cos(deg) 0 0 ]. + + degrees: Degrees of counter clockwise rotation. Values less + than zero and greater than 360 are handled as expected. + + Returns the new matrix. +*/ +fz_matrix fz_rotate(float degrees); + +/** + Rotate a transformation by premultiplying. + + The premultiplied matrix is of the form + [ cos(deg) sin(deg) -sin(deg) cos(deg) 0 0 ]. + + m: The matrix to premultiply. + + degrees: Degrees of counter clockwise rotation. Values less + than zero and greater than 360 are handled as expected. + + Returns the rotated matrix. +*/ +fz_matrix fz_pre_rotate(fz_matrix m, float degrees); + +/** + Create a translation matrix. + + The returned matrix is of the form [ 1 0 0 1 tx ty ]. + + tx, ty: Translation distances along the X- and Y-axes.
A + translation of 0 will not cause any translation along the + relevant axis. + + Returns the new matrix. +*/ +fz_matrix fz_translate(float tx, float ty); + +/** + Translate a matrix by premultiplication. + + m: The matrix to translate + + tx, ty: Translation distances along the X- and Y-axes. A + translation of 0 will not cause any translation along the + relevant axis. + + Returns the translated matrix. +*/ +fz_matrix fz_pre_translate(fz_matrix m, float tx, float ty); + +/** + Create transform matrix to draw page + at a given resolution and rotation. Adjusts the scaling + factors so that the page covers a whole number of + pixels and adjusts the page origin to be at 0,0. +*/ +fz_matrix fz_transform_page(fz_rect mediabox, float resolution, float rotate); + +/** + Create an inverse matrix. + + matrix: Matrix to invert. A degenerate matrix, where the + determinant is equal to zero, cannot be inverted and the + original matrix is returned instead. + + Returns the inverse matrix. +*/ +fz_matrix fz_invert_matrix(fz_matrix matrix); + +/** + Attempt to create an inverse matrix. + + inv: Place to store inverse matrix. + + src: Matrix to invert. A degenerate matrix, where the + determinant is equal to zero, cannot be inverted. + + Returns 1 if matrix is degenerate (singular), or 0 otherwise. +*/ +int fz_try_invert_matrix(fz_matrix *inv, fz_matrix src); + +/** + Check if a transformation is rectilinear. + + Rectilinear means that no shearing is present and that any + rotations present are a multiple of 90 degrees. Usually this + is used to make sure that axis-aligned rectangles before the + transformation are still axis-aligned rectangles afterwards. +*/ +int fz_is_rectilinear(fz_matrix m); + +/** + Calculate average scaling factor of matrix. +*/ +float fz_matrix_expansion(fz_matrix m); + +/** + Compute intersection of two rectangles.
+ + Given two rectangles, update the first to be the smallest + axis-aligned rectangle that covers the area covered by both + given rectangles. If either rectangle is empty then the + intersection is also empty. If either rectangle is infinite + then the intersection is simply the non-infinite rectangle. + Should both rectangles be infinite, then the intersection is + also infinite. +*/ +fz_rect fz_intersect_rect(fz_rect a, fz_rect b); + +/** + Compute intersection of two bounding boxes. + + Similar to fz_intersect_rect but operates on two bounding + boxes instead of two rectangles. +*/ +fz_irect fz_intersect_irect(fz_irect a, fz_irect b); + +/** + Compute union of two rectangles. + + Given two rectangles, update the first to be the smallest + axis-aligned rectangle that encompasses both given rectangles. + If either rectangle is infinite then the union is also infinite. + If either rectangle is empty then the union is simply the + non-empty rectangle. Should both rectangles be empty, then the + union is also empty. +*/ +fz_rect fz_union_rect(fz_rect a, fz_rect b); + +/** + Convert a rect into the minimal bounding box + that covers the rectangle. + + Coordinates in a bounding box are integers, so rounding of the + rects coordinates takes place. The top left corner is rounded + upwards and left while the bottom right corner is rounded + downwards and to the right. +*/ +fz_irect fz_irect_from_rect(fz_rect rect); + +/** + Round rectangle coordinates. + + Coordinates in a bounding box are integers, so rounding of the + rects coordinates takes place. The top left corner is rounded + upwards and left while the bottom right corner is rounded + downwards and to the right. + + This differs from fz_irect_from_rect, in that fz_irect_from_rect + slavishly follows the numbers (i.e any slight over/under + calculations can cause whole extra pixels to be added). + fz_round_rect allows for a small amount of rounding error when + calculating the bbox. 
+*/ +fz_irect fz_round_rect(fz_rect rect); + +/** + Convert a bbox into a rect. + + For our purposes, a rect can represent all the values we meet in + a bbox, so nothing can go wrong. + + bbox: The bbox to convert. + + Returns the converted rect. +*/ +fz_rect fz_rect_from_irect(fz_irect bbox); + +/** + Expand a bbox by a given amount in all directions. +*/ +fz_rect fz_expand_rect(fz_rect b, float expand); +fz_irect fz_expand_irect(fz_irect a, int expand); + +/** + Expand a bbox to include a given point. + To create a rectangle that encompasses a sequence of points, the + rectangle must first be set to be the empty rectangle at one of + the points before including the others. +*/ +fz_rect fz_include_point_in_rect(fz_rect r, fz_point p); + +/** + Translate bounding box. + + Translate a bbox by a given x and y offset. Allows for overflow. +*/ +fz_rect fz_translate_rect(fz_rect a, float xoff, float yoff); +fz_irect fz_translate_irect(fz_irect a, int xoff, int yoff); + +/** + Test rectangle inclusion. + + Return true if a entirely contains b. +*/ +int fz_contains_rect(fz_rect a, fz_rect b); + +/** + Apply a transformation to a point. + + m: Transformation matrix to apply. See fz_concat, + fz_scale, fz_rotate and fz_translate for how to create a + matrix. + + point: The point to transform. + + Returns the transformed point. +*/ +fz_point fz_transform_point(fz_point point, fz_matrix m); +fz_point fz_transform_point_xy(float x, float y, fz_matrix m); + +/** + Apply a transformation to a vector. + + m: Transformation matrix to apply. See fz_concat, + fz_scale and fz_rotate for how to create a matrix. Any + translation will be ignored. + + vector: The vector to transform. +*/ +fz_point fz_transform_vector(fz_point vector, fz_matrix m); + +/** + Apply a transform to a rectangle. + + After the four corner points of the axis-aligned rectangle + have been transformed it may no longer be axis-aligned.
So a + new axis-aligned rectangle is created covering at least the + area of the transformed rectangle. + + transform: Transformation matrix to apply. See fz_concat, + fz_scale and fz_rotate for how to create a matrix. + + rect: Rectangle to be transformed. The two special cases + fz_empty_rect and fz_infinite_rect, may be used but are + returned unchanged as expected. +*/ +fz_rect fz_transform_rect(fz_rect rect, fz_matrix m); + +/** + Normalize a vector to length one. +*/ +fz_point fz_normalize_vector(fz_point p); + +/** + Grid fit a matrix. + + as_tiled = 0 => adjust the matrix so that the image of the unit + square completely covers any pixel that was touched by the + image of the unit square under the original matrix. + + as_tiled = 1 => adjust the matrix so that the corners of the + image of the unit square align with the closest integer corner + of the image of the unit square under the original matrix. +*/ +fz_matrix fz_gridfit_matrix(int as_tiled, fz_matrix m); + +/** + Find the largest expansion performed by this matrix. + (i.e. max(abs(m.a),abs(m.b),abs(m.c),abs(m.d)) +*/ +float fz_matrix_max_expansion(fz_matrix m); + +/** + A representation for a region defined by 4 points. + + The significant difference between quads and rects is that + the edges of quads are not axis aligned. +*/ +typedef struct +{ + fz_point ul, ur, ll, lr; +} fz_quad; + +/** + Inline convenience construction function. +*/ +static inline fz_quad fz_make_quad( + float ul_x, float ul_y, + float ur_x, float ur_y, + float ll_x, float ll_y, + float lr_x, float lr_y) +{ + fz_quad q = { + { ul_x, ul_y }, + { ur_x, ur_y }, + { ll_x, ll_y }, + { lr_x, lr_y }, + }; + return q; +} + +/** + Convert a rect to a quad (losslessly). +*/ +fz_quad fz_quad_from_rect(fz_rect r); + +/** + Convert a quad to the smallest rect that covers it. +*/ +fz_rect fz_rect_from_quad(fz_quad q); + +/** + Transform a quad by a matrix. 
+*/ +fz_quad fz_transform_quad(fz_quad q, fz_matrix m); + +/** + Inclusion test for quads. +*/ +int fz_is_point_inside_quad(fz_point p, fz_quad q); + +/** + Inclusion test for rects. (Rect is assumed to be open, i.e. + top right corner is not included). +*/ +int fz_is_point_inside_rect(fz_point p, fz_rect r); + +/** + Inclusion test for irects. (Rect is assumed to be open, i.e. + top right corner is not included). +*/ +int fz_is_point_inside_irect(int x, int y, fz_irect r); + +/** + Inclusion test for quad in quad. + + This may break down if quads are not 'well formed'. +*/ +int fz_is_quad_inside_quad(fz_quad needle, fz_quad haystack); + +/** + Intersection test for quads. + + This may break down if quads are not 'well formed'. +*/ +int fz_is_quad_intersecting_quad(fz_quad a, fz_quad b); + +#endif diff --git a/include/mupdf/fitz/getopt.h b/include/mupdf/fitz/getopt.h new file mode 100644 index 0000000..13677d1 --- /dev/null +++ b/include/mupdf/fitz/getopt.h @@ -0,0 +1,35 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_GETOPT_H +#define MUPDF_FITZ_GETOPT_H + +#include "export.h" + +/** + Simple functions/variables for use in tools. +*/ +extern int fz_getopt(int nargc, char * const *nargv, const char *ostr); +FZ_DATA extern int fz_optind; +FZ_DATA extern char *fz_optarg; + +#endif diff --git a/include/mupdf/fitz/glyph-cache.h b/include/mupdf/fitz/glyph-cache.h new file mode 100644 index 0000000..c4b5fcf --- /dev/null +++ b/include/mupdf/fitz/glyph-cache.h @@ -0,0 +1,96 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_GLYPH_CACHE_H +#define MUPDF_FITZ_GLYPH_CACHE_H + +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/font.h" +#include "mupdf/fitz/pixmap.h" +#include "mupdf/fitz/device.h" + +/** + Purge all the glyphs from the cache. +*/ +void fz_purge_glyph_cache(fz_context *ctx); + +/** + Create a pixmap containing a rendered glyph. + + Lookup gid from font, clip it with scissor, and rendering it + with aa bits of antialiasing into a new pixmap. + + The caller takes ownership of the pixmap and so must free it. 
+ + Note: This function is no longer used for normal rendering + operations, and is kept around just because we use it in the + app. It should be considered "at risk" of removal from the API. +*/ +fz_pixmap *fz_render_glyph_pixmap(fz_context *ctx, fz_font *font, int gid, fz_matrix *ctm, const fz_irect *scissor, int aa); + +/** + Nasty PDF interpreter specific hernia, required to allow the + interpreter to replay glyphs from a type3 font directly into + the target device. + + This is only used in exceptional circumstances (such as type3 + glyphs that inherit current graphics state, or nested type3 + glyphs). +*/ +void fz_render_t3_glyph_direct(fz_context *ctx, fz_device *dev, fz_font *font, int gid, fz_matrix trm, void *gstate, fz_default_colorspaces *def_cs); + +/** + Force a type3 font to cache the displaylist for a given glyph + id. + + This caching can involve reading the underlying file, so must + happen ahead of time, so we aren't suddenly forced to read the + file while playing a displaylist back. +*/ +void fz_prepare_t3_glyph(fz_context *ctx, fz_font *font, int gid); + +/** + Dump debug statistics for the glyph cache. +*/ +void fz_dump_glyph_cache_stats(fz_context *ctx, fz_output *out); + +/** + Perform subpixel quantisation and adjustment on a glyph matrix. + + ctm: On entry, the desired 'ideal' transformation for a glyph. + On exit, adjusted to a (very similar) transformation quantised + for subpixel caching. + + subpix_ctm: Initialised by the routine to the transform that + should be used to render the glyph. + + qe, qf: which subpixel position we quantised to. + + Returns: the size of the glyph. + + Note: This is currently only exposed for use in our app. It + should be considered "at risk" of removal from the API. 
+*/ +float fz_subpixel_adjust(fz_context *ctx, fz_matrix *ctm, fz_matrix *subpix_ctm, unsigned char *qe, unsigned char *qf); + +#endif diff --git a/include/mupdf/fitz/glyph.h b/include/mupdf/fitz/glyph.h new file mode 100644 index 0000000..960a4ff --- /dev/null +++ b/include/mupdf/fitz/glyph.h @@ -0,0 +1,81 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_GLYPH_H +#define MUPDF_FITZ_GLYPH_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/store.h" +#include "mupdf/fitz/font.h" +#include "mupdf/fitz/path.h" + +/** + Glyphs represent a run length encoded set of pixels for a 2 + dimensional region of a plane. +*/ +typedef struct fz_glyph fz_glyph; + +/** + Return the bounding box of the glyph in pixels. +*/ +fz_irect fz_glyph_bbox(fz_context *ctx, fz_glyph *glyph); +fz_irect fz_glyph_bbox_no_ctx(fz_glyph *src); + +/** + Return the width of the glyph in pixels. +*/ +int fz_glyph_width(fz_context *ctx, fz_glyph *glyph); + +/** + Return the height of the glyph in pixels. 
+*/ +int fz_glyph_height(fz_context *ctx, fz_glyph *glyph); + +/** + Take a reference to a glyph. + + pix: The glyph to increment the reference for. + + Returns pix. +*/ +fz_glyph *fz_keep_glyph(fz_context *ctx, fz_glyph *pix); + +/** + Drop a reference and free a glyph. + + Decrement the reference count for the glyph. When no + references remain the glyph will be freed. +*/ +void fz_drop_glyph(fz_context *ctx, fz_glyph *pix); + +/** + Look a glyph up from a font, and return the outline of the + glyph using the given transform. + + The caller owns the returned path, and so is responsible for + ensuring that it eventually gets dropped. +*/ +fz_path *fz_outline_glyph(fz_context *ctx, fz_font *font, int gid, fz_matrix ctm); + +#endif diff --git a/include/mupdf/fitz/hash.h b/include/mupdf/fitz/hash.h new file mode 100644 index 0000000..873cf32 --- /dev/null +++ b/include/mupdf/fitz/hash.h @@ -0,0 +1,126 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_HASH_H +#define MUPDF_FITZ_HASH_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/output.h" + +#define FZ_HASH_TABLE_KEY_LENGTH 48 + +/** + Generic hash-table with fixed-length keys. + + The keys and values are NOT reference counted by the hash table. + Callers are responsible for taking care the reference counts are + correct. Inserting a duplicate entry will NOT overwrite the old + value, and will return the old value. + + The drop_val callback function is only used to release values + when the hash table is destroyed. +*/ + +typedef struct fz_hash_table fz_hash_table; + +/** + Function type called when a hash table entry is dropped. + + Only used when the entire hash table is dropped. +*/ +typedef void (fz_hash_table_drop_fn)(fz_context *ctx, void *val); + +/** + Create a new hash table. + + initialsize: The initial size of the hashtable. The hashtable + may grow (double in size) if it starts to get crowded (80% + full). + + keylen: byte length for each key. + + lock: -1 for no lock, otherwise the FZ_LOCK to use to protect + this table. + + drop_val: Function to use to destroy values on table drop. +*/ +fz_hash_table *fz_new_hash_table(fz_context *ctx, int initialsize, int keylen, int lock, fz_hash_table_drop_fn *drop_val); + +/** + Destroy the hash table. + + Values are dropped using the drop function. +*/ +void fz_drop_hash_table(fz_context *ctx, fz_hash_table *table); + +/** + Search for a matching hash within the table, and return the + associated value. +*/ +void *fz_hash_find(fz_context *ctx, fz_hash_table *table, const void *key); + +/** + Insert a new key/value pair into the hash table. + + If an existing entry with the same key is found, no change is + made to the hash table, and a pointer to the existing value is + returned. + + If no existing entry with the same key is found, ownership of + val passes in, key is copied, and NULL is returned. 
+*/ +void *fz_hash_insert(fz_context *ctx, fz_hash_table *table, const void *key, void *val); + +/** + Remove the entry for a given key. + + The value is NOT freed, so the caller is expected to take care + of this. +*/ +void fz_hash_remove(fz_context *ctx, fz_hash_table *table, const void *key); + +/** + Callback function called on each key/value pair in the hash + table, when fz_hash_for_each is run. +*/ +typedef void (fz_hash_table_for_each_fn)(fz_context *ctx, void *state, void *key, int keylen, void *val); + +/** + Iterate over the entries in a hash table. +*/ +void fz_hash_for_each(fz_context *ctx, fz_hash_table *table, void *state, fz_hash_table_for_each_fn *callback); + +/** + Callback function called on each key/value pair in the hash + table, when fz_hash_filter is run to remove entries where the + callback returns true. +*/ +typedef int (fz_hash_table_filter_fn)(fz_context *ctx, void *state, void *key, int keylen, void *val); + +/** + Iterate over the entries in a hash table, removing all the ones where callback returns true. + Does NOT free the value of the entry, so the caller is expected to take care of this. +*/ +void fz_hash_filter(fz_context *ctx, fz_hash_table *table, void *state, fz_hash_table_filter_fn *callback); + +#endif diff --git a/include/mupdf/fitz/image.h b/include/mupdf/fitz/image.h new file mode 100644 index 0000000..972f8fb --- /dev/null +++ b/include/mupdf/fitz/image.h @@ -0,0 +1,428 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. 
See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_IMAGE_H +#define MUPDF_FITZ_IMAGE_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/store.h" +#include "mupdf/fitz/pixmap.h" + +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/stream.h" +#include "mupdf/fitz/compressed-buffer.h" + +/** + Images are storable objects from which we can obtain fz_pixmaps. + These may be implemented as simple wrappers around a pixmap, or + as more complex things that decode at different subsample + settings on demand. +*/ +typedef struct fz_image fz_image; +typedef struct fz_compressed_image fz_compressed_image; +typedef struct fz_pixmap_image fz_pixmap_image; + +/** + Called to get a handle to a pixmap from an image. + + image: The image to retrieve a pixmap from. + + subarea: The subarea of the image that we actually care about + (or NULL to indicate the whole image). + + ctm: Optional, unless subarea is given. If given, then on + entry this is the transform that will be applied to the complete + image. It should be updated on exit to the transform to apply to + the given subarea of the image. This is used to calculate the + desired width/height for subsampling. + + w: If non-NULL, a pointer to an int to be updated on exit to the + width (in pixels) that the scaled output will cover. + + h: If non-NULL, a pointer to an int to be updated on exit to the + height (in pixels) that the scaled output will cover. + + Returns a non NULL kept pixmap pointer. May throw exceptions. 
+*/ +fz_pixmap *fz_get_pixmap_from_image(fz_context *ctx, fz_image *image, const fz_irect *subarea, fz_matrix *ctm, int *w, int *h); + +/** + Calls fz_get_pixmap_from_image() with ctm, subarea, w and h all set to NULL. +*/ +fz_pixmap *fz_get_unscaled_pixmap_from_image(fz_context *ctx, fz_image *image); + +/** + Increment the (normal) reference count for an image. Returns the + same pointer. + + Never throws exceptions. +*/ +fz_image *fz_keep_image(fz_context *ctx, fz_image *image); + +/** + Decrement the (normal) reference count for an image. When the + total (normal + key) reference count reaches zero, the image and + its resources are freed. + + Never throws exceptions. +*/ +void fz_drop_image(fz_context *ctx, fz_image *image); + +/** + Increment the store key reference for an image. Returns the same + pointer. (This is the count of references for an image held by + keys in the image store). + + Never throws exceptions. +*/ +fz_image *fz_keep_image_store_key(fz_context *ctx, fz_image *image); + +/** + Decrement the store key reference count for an image. When the + total (normal + key) reference count reaches zero, the image and + its resources are freed. + + Never throws exceptions. +*/ +void fz_drop_image_store_key(fz_context *ctx, fz_image *image); + +/** + Function type to destroy an image's data + when its reference count reaches zero. +*/ +typedef void (fz_drop_image_fn)(fz_context *ctx, fz_image *image); + +/** + Function type to get a decoded pixmap for an image. + + im: The image to decode. + + subarea: NULL, or the subarea of the image required. Expressed + in terms of a rectangle in the original width/height of the + image. If non NULL, this should be updated by the function to + the actual subarea decoded - which must include the requested + area! + + w, h: The actual width and height that the whole image would + need to be decoded to. + + l2factor: On entry, the log 2 subsample factor required.
If + possible the decode process can take care of (all or some) of + this subsampling, and must then update the value so the caller + knows what remains to be done. + + Returns a reference to a decoded pixmap that satisfies the + requirements of the request. The caller owns the returned + reference. +*/ +typedef fz_pixmap *(fz_image_get_pixmap_fn)(fz_context *ctx, fz_image *im, fz_irect *subarea, int w, int h, int *l2factor); + +/** + Function type to get the given storage + size for an image. + + Returns the size in bytes used for a given image. +*/ +typedef size_t (fz_image_get_size_fn)(fz_context *, fz_image *); + +/** + Internal function to make a new fz_image structure + for a derived class. + + w,h: Width and height of the created image. + + bpc: Bits per component. + + colorspace: The colorspace (determines the number of components, + and any color conversions required while decoding). + + xres, yres: The X and Y resolutions respectively. + + interpolate: 1 if interpolation should be used when decoding + this image, 0 otherwise. + + imagemask: 1 if this is an imagemask (i.e. transparent), 0 + otherwise. + + decode: NULL, or a pointer to to a decode array. The default + decode array is [0 1] (repeated n times, for n color components). + + colorkey: NULL, or a pointer to a colorkey array. The default + colorkey array is [0 255] (repeated n times, for n color + components). + + mask: NULL, or another image to use as a mask for this one. + A new reference is taken to this image. Supplying a masked + image as a mask to another image is illegal! + + size: The size of the required allocated structure (the size of + the derived structure). + + get: The function to be called to obtain a decoded pixmap. + + get_size: The function to be called to return the storage size + used by this image. + + drop: The function to be called to dispose of this image once + the last reference is dropped. 
+ + Returns a pointer to an allocated structure of the required size, + with the first sizeof(fz_image) bytes initialised as appropriate + given the supplied parameters, and the other bytes set to zero. +*/ +fz_image *fz_new_image_of_size(fz_context *ctx, + int w, + int h, + int bpc, + fz_colorspace *colorspace, + int xres, + int yres, + int interpolate, + int imagemask, + float *decode, + int *colorkey, + fz_image *mask, + size_t size, + fz_image_get_pixmap_fn *get_pixmap, + fz_image_get_size_fn *get_size, + fz_drop_image_fn *drop); + +#define fz_new_derived_image(CTX,W,H,B,CS,X,Y,I,IM,D,C,M,T,G,S,Z) \ + ((T*)Memento_label(fz_new_image_of_size(CTX,W,H,B,CS,X,Y,I,IM,D,C,M,sizeof(T),G,S,Z),#T)) + +/** + Create an image based on + the data in the supplied compressed buffer. + + w,h: Width and height of the created image. + + bpc: Bits per component. + + colorspace: The colorspace (determines the number of components, + and any color conversions required while decoding). + + xres, yres: The X and Y resolutions respectively. + + interpolate: 1 if interpolation should be used when decoding + this image, 0 otherwise. + + imagemask: 1 if this is an imagemask (i.e. transparency bitmap + mask), 0 otherwise. + + decode: NULL, or a pointer to to a decode array. The default + decode array is [0 1] (repeated n times, for n color components). + + colorkey: NULL, or a pointer to a colorkey array. The default + colorkey array is [0 255] (repeated n times, for n color + components). + + buffer: Buffer of compressed data and compression parameters. + Ownership of this reference is passed in. + + mask: NULL, or another image to use as a mask for this one. + A new reference is taken to this image. Supplying a masked + image as a mask to another image is illegal! 
+*/ +fz_image *fz_new_image_from_compressed_buffer(fz_context *ctx, int w, int h, int bpc, fz_colorspace *colorspace, int xres, int yres, int interpolate, int imagemask, float *decode, int *colorkey, fz_compressed_buffer *buffer, fz_image *mask); + +/** + Create an image from the given + pixmap. + + pixmap: The pixmap to base the image upon. A new reference + to this is taken. + + mask: NULL, or another image to use as a mask for this one. + A new reference is taken to this image. Supplying a masked + image as a mask to another image is illegal! +*/ +fz_image *fz_new_image_from_pixmap(fz_context *ctx, fz_pixmap *pixmap, fz_image *mask); + +/** + Create a new image from a + buffer of data, inferring its type from the format + of the data. +*/ +fz_image *fz_new_image_from_buffer(fz_context *ctx, fz_buffer *buffer); + +/** + Create a new image from the contents + of a file, inferring its type from the format of the + data. +*/ +fz_image *fz_new_image_from_file(fz_context *ctx, const char *path); + +/** + Internal destructor exposed for fz_store integration. +*/ +void fz_drop_image_imp(fz_context *ctx, fz_storable *image); + +/** + Internal destructor for the base image class members. + + Exposed to allow derived image classes to be written. +*/ +void fz_drop_image_base(fz_context *ctx, fz_image *image); + +/** + Decode a subarea of a compressed image. l2factor is the amount + of subsampling inbuilt to the stream (i.e. performed by the + decoder). If non NULL, l2extra is the extra amount of + subsampling that should be performed by this routine. This will + be updated on exit to the amount of subsampling that is still + required to be done. + + Returns a kept reference. +*/ +fz_pixmap *fz_decomp_image_from_stream(fz_context *ctx, fz_stream *stm, fz_compressed_image *image, fz_irect *subarea, int indexed, int l2factor, int *l2extra); + +/** + Convert pixmap from indexed to base colorspace. + + This creates a new bitmap containing the converted pixmap data. 
+ */ +fz_pixmap *fz_convert_indexed_pixmap_to_base(fz_context *ctx, const fz_pixmap *src); + +/** + Convert pixmap from DeviceN/Separation to base colorspace. + + This creates a new bitmap containing the converted pixmap data. +*/ +fz_pixmap *fz_convert_separation_pixmap_to_base(fz_context *ctx, const fz_pixmap *src); + +/** + Return the size of the storage used by an image. +*/ +size_t fz_image_size(fz_context *ctx, fz_image *im); + +/** + Structure is public to allow other structures to + be derived from it. Do not access members directly. +*/ +struct fz_image +{ + fz_key_storable key_storable; + int w, h; + uint8_t n; + uint8_t bpc; + unsigned int imagemask:1; + unsigned int interpolate:1; + unsigned int use_colorkey:1; + unsigned int use_decode:1; + unsigned int decoded:1; + unsigned int scalable:1; + uint8_t orientation; + fz_image *mask; + int xres; /* As given in the image, not necessarily as rendered */ + int yres; /* As given in the image, not necessarily as rendered */ + fz_colorspace *colorspace; + fz_drop_image_fn *drop_image; + fz_image_get_pixmap_fn *get_pixmap; + fz_image_get_size_fn *get_size; + int colorkey[FZ_MAX_COLORS * 2]; + float decode[FZ_MAX_COLORS * 2]; +}; + +/** + Request the natural resolution + of an image. + + xres, yres: Pointers to ints to be updated with the + natural resolution of an image (or a sensible default + if not encoded). +*/ +void fz_image_resolution(fz_image *image, int *xres, int *yres); + +/** + Request the natural orientation of an image. + + This is for images (such as JPEG) that can contain internal + specifications of rotation/flips. This is ignored by all the + internal decode/rendering routines, but can be used by callers + (such as the image document handler) to respect such + specifications. + + The values used by MuPDF are as follows, with the equivalent + Exif specifications given for information: + + 0: Undefined + 1: 0 degree ccw rotation. (Exif = 1) + 2: 90 degree ccw rotation. 
(Exif = 8) + 3: 180 degree ccw rotation. (Exif = 3) + 4: 270 degree ccw rotation. (Exif = 6) + 5: flip on X. (Exif = 2) + 6: flip on X, then rotate ccw by 90 degrees. (Exif = 5) + 7: flip on X, then rotate ccw by 180 degrees. (Exif = 4) + 8: flip on X, then rotate ccw by 270 degrees. (Exif = 7) +*/ +uint8_t fz_image_orientation(fz_context *ctx, fz_image *image); + +fz_matrix +fz_image_orientation_matrix(fz_context *ctx, fz_image *image); + +/** + Retrieve the underlying compressed data for an image. + + Returns a pointer to the underlying data buffer for an image, + or NULL if this image is not based upon a compressed data + buffer. + + This is not a reference counted structure, so no reference is + returned. Lifespan is limited to that of the image itself. +*/ +fz_compressed_buffer *fz_compressed_image_buffer(fz_context *ctx, fz_image *image); +void fz_set_compressed_image_buffer(fz_context *ctx, fz_compressed_image *cimg, fz_compressed_buffer *buf); + +/** + Retrieve the underlying fz_pixmap for an image. + + Returns a pointer to the underlying fz_pixmap for an image, + or NULL if this image is not based upon an fz_pixmap. + + No reference is returned. Lifespan is limited to that of + the image itself. If required, use fz_keep_pixmap to take + a reference to keep it longer. +*/ +fz_pixmap *fz_pixmap_image_tile(fz_context *ctx, fz_pixmap_image *cimg); +void fz_set_pixmap_image_tile(fz_context *ctx, fz_pixmap_image *cimg, fz_pixmap *pix); + +/* Implementation details: subject to change. */ + +/** + Exposed for PDF. +*/ +fz_pixmap *fz_load_jpx(fz_context *ctx, const unsigned char *data, size_t size, fz_colorspace *cs); + +/** + Exposed for CBZ. 
+*/ +int fz_load_tiff_subimage_count(fz_context *ctx, const unsigned char *buf, size_t len); +fz_pixmap *fz_load_tiff_subimage(fz_context *ctx, const unsigned char *buf, size_t len, int subimage); +int fz_load_pnm_subimage_count(fz_context *ctx, const unsigned char *buf, size_t len); +fz_pixmap *fz_load_pnm_subimage(fz_context *ctx, const unsigned char *buf, size_t len, int subimage); +int fz_load_jbig2_subimage_count(fz_context *ctx, const unsigned char *buf, size_t len); +fz_pixmap *fz_load_jbig2_subimage(fz_context *ctx, const unsigned char *buf, size_t len, int subimage); +int fz_load_bmp_subimage_count(fz_context *ctx, const unsigned char *buf, size_t len); +fz_pixmap *fz_load_bmp_subimage(fz_context *ctx, const unsigned char *buf, size_t len, int subimage); + +#endif diff --git a/include/mupdf/fitz/link.h b/include/mupdf/fitz/link.h new file mode 100644 index 0000000..1a20a22 --- /dev/null +++ b/include/mupdf/fitz/link.h @@ -0,0 +1,130 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_LINK_H +#define MUPDF_FITZ_LINK_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/types.h" + +typedef struct fz_link fz_link; +typedef void (fz_link_set_rect_fn)(fz_context *ctx, fz_link *link, fz_rect rect); +typedef void (fz_link_set_uri_fn)(fz_context *ctx, fz_link *link, const char *uri); +typedef void (fz_link_drop_link_fn)(fz_context *ctx, fz_link *link); + +/** + fz_link is a list of interactive links on a page. + + There is no relation between the order of the links in the + list and the order they appear on the page. The list of links + for a given page can be obtained from fz_load_links. + + A link is reference counted. Dropping a reference to a link is + done by calling fz_drop_link. + + rect: The hot zone. The area that can be clicked in + untransformed coordinates. + + uri: Link destinations come in two forms: internal and external. + Internal links refer to other pages in the same document. + External links are URLs to other documents. + + next: A pointer to the next link on the same page. +*/ +typedef struct fz_link +{ + int refs; + struct fz_link *next; + fz_rect rect; + char *uri; + fz_link_set_rect_fn *set_rect_fn; + fz_link_set_uri_fn *set_uri_fn; + fz_link_drop_link_fn *drop; +} fz_link; + +typedef enum +{ + FZ_LINK_DEST_FIT, + FZ_LINK_DEST_FIT_B, + FZ_LINK_DEST_FIT_H, + FZ_LINK_DEST_FIT_BH, + FZ_LINK_DEST_FIT_V, + FZ_LINK_DEST_FIT_BV, + FZ_LINK_DEST_FIT_R, + FZ_LINK_DEST_XYZ +} fz_link_dest_type; + +typedef struct +{ + fz_location loc; + fz_link_dest_type type; + float x, y, w, h, zoom; +} fz_link_dest; + +fz_link_dest fz_make_link_dest_none(void); +fz_link_dest fz_make_link_dest_xyz(int chapter, int page, float x, float y, float z); + +/** + Create a new link record. + + next is set to NULL with the expectation that the caller will + handle the linked list setup. Internal function. 
+ + Different document types will be implemented by deriving from + fz_link. This macro allocates such derived structures, and + initialises the base sections. +*/ +fz_link *fz_new_link_of_size(fz_context *ctx, int size, fz_rect rect, const char *uri); +#define fz_new_derived_link(CTX,TYPE,RECT,URI) \ + ((TYPE *)Memento_label(fz_new_link_of_size(CTX,sizeof(TYPE),RECT,URI),#TYPE)) + +/** + Increment the reference count for a link. The same pointer is + returned. + + Never throws exceptions. +*/ +fz_link *fz_keep_link(fz_context *ctx, fz_link *link); + +/** + Decrement the reference count for a link. When the reference + count reaches zero, the link is destroyed. + + When a link is freed, the reference for any linked link (next) + is dropped too, thus an entire linked list of fz_link's can be + freed by just dropping the head. +*/ +void fz_drop_link(fz_context *ctx, fz_link *link); + +/** + Query whether a link is external to a document (determined by + uri containing a ':', intended to match with '://' which + separates the scheme from the scheme specific parts in URIs). +*/ +int fz_is_external_link(fz_context *ctx, const char *uri); + +void fz_set_link_rect(fz_context *ctx, fz_link *link, fz_rect rect); +void fz_set_link_uri(fz_context *ctx, fz_link *link, const char *uri); + +#endif diff --git a/include/mupdf/fitz/log.h b/include/mupdf/fitz/log.h new file mode 100644 index 0000000..50892a0 --- /dev/null +++ b/include/mupdf/fitz/log.h @@ -0,0 +1,61 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. 
+// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_LOG_H +#define MUPDF_FITZ_LOG_H + +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/output.h" + +/** + The functions in this file offer simple logging abilities. + + The default logfile is "fitz_log.txt". This can be overridden by + defining an environment variable "FZ_LOG_FILE", or module + specific environment variables "FZ_LOG_FILE_" (e.g. + "FZ_LOG_FILE_STORE"). + + Enable the following define(s) to enable built in debug logging + from within the appropriate module(s). +*/ + +/* #define ENABLE_STORE_LOGGING */ + + +/** + Output a line to the log. +*/ +void fz_log(fz_context *ctx, const char *fmt, ...); + +/** + Output a line to the log for a given module. +*/ +void fz_log_module(fz_context *ctx, const char *module, const char *fmt, ...); + +/** + Internal function to actually do the opening of the logfile. + + Caller should close/drop the output when finished with it. +*/ +fz_output *fz_new_log_for_module(fz_context *ctx, const char *module); + +#endif diff --git a/include/mupdf/fitz/outline.h b/include/mupdf/fitz/outline.h new file mode 100644 index 0000000..6f5810c --- /dev/null +++ b/include/mupdf/fitz/outline.h @@ -0,0 +1,228 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. 
+// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_OUTLINE_H +#define MUPDF_FITZ_OUTLINE_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/types.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/link.h" +#include "mupdf/fitz/output.h" + +/* Outline */ + +typedef struct { + char *title; + char *uri; + int is_open; +} fz_outline_item; + +typedef struct fz_outline_iterator fz_outline_iterator; + +/** + Call to get the current outline item. + + Can return NULL. The item is only valid until the next call. +*/ +fz_outline_item *fz_outline_iterator_item(fz_context *ctx, fz_outline_iterator *iter); + +/** + Calls to move the iterator position. + + A negative return value means we could not move as requested. Otherwise: + 0 = the final position has a valid item. + 1 = not a valid item, but we can insert an item here. 
+*/ +int fz_outline_iterator_next(fz_context *ctx, fz_outline_iterator *iter); +int fz_outline_iterator_prev(fz_context *ctx, fz_outline_iterator *iter); +int fz_outline_iterator_up(fz_context *ctx, fz_outline_iterator *iter); +int fz_outline_iterator_down(fz_context *ctx, fz_outline_iterator *iter); + +/** + Call to insert a new item BEFORE the current point. + + Ownership of pointers is retained by the caller. The item data will be copied. + + After an insert, we do not change where we are pointing. + The return code is the same as for next; it indicates the current iterator position. +*/ +int fz_outline_iterator_insert(fz_context *ctx, fz_outline_iterator *iter, fz_outline_item *item); + +/** + Delete the current item. + + This implicitly moves us to the 'next' item, and the return code is as for fz_outline_iterator_next. +*/ +int fz_outline_iterator_delete(fz_context *ctx, fz_outline_iterator *iter); + +/** + Update the current item properties according to the given item. +*/ +void fz_outline_iterator_update(fz_context *ctx, fz_outline_iterator *iter, fz_outline_item *item); + +/** + Drop the current iterator. +*/ +void fz_drop_outline_iterator(fz_context *ctx, fz_outline_iterator *iter); + + +/** Structure based API */ + +/** + fz_outline is a tree of the outline of a document (also known + as table of contents). + + title: Title of outline item using UTF-8 encoding. May be NULL + if the outline item has no text string. + + uri: Destination in the document to be displayed when this + outline item is activated. May be an internal or external + link, or NULL if the outline item does not have a destination. + + page: The page number of an internal link, or -1 for external + links or links with no destination. + + next: The next outline item at the same level as this outline + item. May be NULL if no more outline items exist at this level. + + down: The outline item's immediate children in the hierarchy. + May be NULL if no children exist. 
+*/ +typedef struct fz_outline +{ + int refs; + char *title; + char *uri; + fz_location page; + float x, y; + struct fz_outline *next; + struct fz_outline *down; + int is_open; +} fz_outline; + +/** + Create a new outline entry with zeroed fields for the caller + to fill in. +*/ +fz_outline *fz_new_outline(fz_context *ctx); + +/** + Increment the reference count. Returns the same pointer. + + Never throws exceptions. +*/ +fz_outline *fz_keep_outline(fz_context *ctx, fz_outline *outline); + +/** + Decrements the reference count. When the reference count + reaches zero, the outline is freed. + + When freed, it will drop linked outline entries (next and down) + too, thus a whole outline structure can be dropped by dropping + the top entry. + + Never throws exceptions. +*/ +void fz_drop_outline(fz_context *ctx, fz_outline *outline); + +/** + Routine to implement the old Structure based API from an iterator. +*/ +fz_outline * +fz_load_outline_from_iterator(fz_context *ctx, fz_outline_iterator *iter); + + +/** + Implementation details. + Of use to people coding new document handlers. +*/ + +/** + Function type for getting the current item. + + Can return NULL. The item is only valid until the next call. +*/ +typedef fz_outline_item *(fz_outline_iterator_item_fn)(fz_context *ctx, fz_outline_iterator *iter); + +/** + Function types for moving the iterator position. + + A negative return value means we could not move as requested. Otherwise: + 0 = the final position has a valid item. + 1 = not a valid item, but we can insert an item here. +*/ +typedef int (fz_outline_iterator_next_fn)(fz_context *ctx, fz_outline_iterator *iter); +typedef int (fz_outline_iterator_prev_fn)(fz_context *ctx, fz_outline_iterator *iter); +typedef int (fz_outline_iterator_up_fn)(fz_context *ctx, fz_outline_iterator *iter); +typedef int (fz_outline_iterator_down_fn)(fz_context *ctx, fz_outline_iterator *iter); + +/** + Function type for inserting a new item BEFORE the current point. 
+ + Ownership of pointers is retained by the caller. The item data will be copied. + + After an insert, we implicitly do a next, so that a successive insert operation + would insert after the item inserted here. The return code is therefore as for next. +*/ +typedef int (fz_outline_iterator_insert_fn)(fz_context *ctx, fz_outline_iterator *iter, fz_outline_item *item); + +/** + Function type for deleting the current item. + + This implicitly moves us to the 'next' item, and the return code is as for fz_outline_iterator_next. +*/ +typedef int (fz_outline_iterator_delete_fn)(fz_context *ctx, fz_outline_iterator *iter); + +/** + Function type for updating the current item properties according to the given item. +*/ +typedef void (fz_outline_iterator_update_fn)(fz_context *ctx, fz_outline_iterator *iter, fz_outline_item *item); + +/** + Function type for dropping the current iterator. +*/ +typedef void (fz_outline_iterator_drop_fn)(fz_context *ctx, fz_outline_iterator *iter); + +#define fz_new_derived_outline_iter(CTX, TYPE, DOC)\ + ((TYPE *)Memento_label(fz_new_outline_iterator_of_size(CTX,sizeof(TYPE),DOC),#TYPE)) + +fz_outline_iterator *fz_new_outline_iterator_of_size(fz_context *ctx, size_t size, fz_document *doc); + +fz_outline_iterator *fz_outline_iterator_from_outline(fz_context *ctx, fz_outline *outline); + +struct fz_outline_iterator { + /* Functions */ + fz_outline_iterator_drop_fn *drop; + fz_outline_iterator_item_fn *item; + fz_outline_iterator_next_fn *next; + fz_outline_iterator_prev_fn *prev; + fz_outline_iterator_up_fn *up; + fz_outline_iterator_down_fn *down; + fz_outline_iterator_insert_fn *insert; + fz_outline_iterator_update_fn *update; + fz_outline_iterator_delete_fn *del; + /* Common state */ + fz_document *doc; +}; + +#endif diff --git a/include/mupdf/fitz/output-svg.h b/include/mupdf/fitz/output-svg.h new file mode 100644 index 0000000..b1b07ab --- /dev/null +++ b/include/mupdf/fitz/output-svg.h @@ -0,0 +1,64 @@ +// Copyright (C) 2004-2021 
Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_OUTPUT_SVG_H +#define MUPDF_FITZ_OUTPUT_SVG_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/device.h" +#include "mupdf/fitz/output.h" + +enum { + FZ_SVG_TEXT_AS_PATH = 0, + FZ_SVG_TEXT_AS_TEXT = 1, +}; + +/** + Create a device that outputs (single page) SVG files to + the given output stream. + + Equivalent to fz_new_svg_device_with_id passing id = NULL. +*/ +fz_device *fz_new_svg_device(fz_context *ctx, fz_output *out, float page_width, float page_height, int text_format, int reuse_images); + +/** + Create a device that outputs (single page) SVG files to + the given output stream. + + output: The output stream to send the constructed SVG page to. + + page_width, page_height: The page dimensions to use (in points). + + text_format: How to emit text. One of the following values: + FZ_SVG_TEXT_AS_TEXT: As elements with possible + layout errors and mismatching fonts. + FZ_SVG_TEXT_AS_PATH: As elements with exact + visual appearance. + + reuse_images: Share image resources using definitions. 
+ + id: ID parameter to keep generated IDs unique across SVG files. +*/ +fz_device *fz_new_svg_device_with_id(fz_context *ctx, fz_output *out, float page_width, float page_height, int text_format, int reuse_images, int *id); + +#endif diff --git a/include/mupdf/fitz/output.h b/include/mupdf/fitz/output.h new file mode 100644 index 0000000..350b00b --- /dev/null +++ b/include/mupdf/fitz/output.h @@ -0,0 +1,383 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_OUTPUT_H +#define MUPDF_FITZ_OUTPUT_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/string-util.h" +#include "mupdf/fitz/stream.h" + +/** + Generic output streams - generalise between outputting to a + file, a buffer, etc. +*/ + +/** + A function type for use when implementing + fz_outputs. The supplied function of this type is called + whenever data is written to the output. + + state: The state for the output stream. + + data: a pointer to a buffer of data to write. + + n: The number of bytes of data to write. 
+*/ +typedef void (fz_output_write_fn)(fz_context *ctx, void *state, const void *data, size_t n); + +/** + A function type for use when implementing + fz_outputs. The supplied function of this type is called when + fz_seek_output is requested. + + state: The output stream state to seek within. + + offset, whence: as defined for fz_seek_output. +*/ +typedef void (fz_output_seek_fn)(fz_context *ctx, void *state, int64_t offset, int whence); + +/** + A function type for use when implementing + fz_outputs. The supplied function of this type is called when + fz_tell_output is requested. + + state: The output stream state to report on. + + Returns the offset within the output stream. +*/ +typedef int64_t (fz_output_tell_fn)(fz_context *ctx, void *state); + +/** + A function type for use when implementing + fz_outputs. The supplied function of this type is called + when the output stream is closed, to flush any pending writes. +*/ +typedef void (fz_output_close_fn)(fz_context *ctx, void *state); + +/** + A function type for use when implementing + fz_outputs. The supplied function of this type is called + when the output stream is dropped, to release the stream + specific state information. +*/ +typedef void (fz_output_drop_fn)(fz_context *ctx, void *state); + +/** + A function type for use when implementing + fz_outputs. The supplied function of this type is called + when the fz_stream_from_output is called. +*/ +typedef fz_stream *(fz_stream_from_output_fn)(fz_context *ctx, void *state); + +/** + A function type for use when implementing + fz_outputs. The supplied function of this type is called + when fz_truncate_output is called to truncate the file + at that point. 
+*/ +typedef void (fz_truncate_fn)(fz_context *ctx, void *state); + +struct fz_output +{ + void *state; + fz_output_write_fn *write; + fz_output_seek_fn *seek; + fz_output_tell_fn *tell; + fz_output_close_fn *close; + fz_output_drop_fn *drop; + fz_stream_from_output_fn *as_stream; + fz_truncate_fn *truncate; + char *bp, *wp, *ep; + /* If buffered is non-zero, then we have that many + * bits (1-7) waiting to be written in bits. */ + int buffered; + int bits; +}; + +/** + Create a new output object with the given + internal state and function pointers. + + state: Internal state (opaque to everything but implementation). + + write: Function to output a given buffer. + + close: Cleanup function to destroy state when output closed. + May permissibly be null. +*/ +fz_output *fz_new_output(fz_context *ctx, int bufsiz, void *state, fz_output_write_fn *write, fz_output_close_fn *close, fz_output_drop_fn *drop); + +/** + Open an output stream that writes to a + given path. + + filename: The filename to write to (specified in UTF-8). + + append: non-zero if we should append to the file, rather than + overwriting it. +*/ +fz_output *fz_new_output_with_path(fz_context *, const char *filename, int append); + +/** + Open an output stream that appends + to a buffer. + + buf: The buffer to append to. +*/ +fz_output *fz_new_output_with_buffer(fz_context *ctx, fz_buffer *buf); + +/** + Retrieve an fz_output that directs to stdout. + + Optionally may be fz_dropped when finished with. +*/ +fz_output *fz_stdout(fz_context *ctx); + +/** + Retrieve an fz_output that directs to stderr. + + Optionally may be fz_dropped when finished with. +*/ +fz_output *fz_stderr(fz_context *ctx); + +#ifdef _WIN32 +/** + Retrieve an fz_output that directs to OutputDebugString. + + Optionally may be fz_dropped when finished with. +*/ +fz_output *fz_stdods(fz_context *ctx); +#endif + +/** + Set the output stream to be used for fz_stddbg. Set to NULL to + reset to default (stderr). 
+*/ +void fz_set_stddbg(fz_context *ctx, fz_output *out); + +/** + Retrieve an fz_output for the default debugging stream. On + Windows this will be OutputDebugString for non-console apps. + Otherwise, it is always fz_stderr. + + Optionally may be fz_dropped when finished with. +*/ +fz_output *fz_stddbg(fz_context *ctx); + +/** + Format and write data to an output stream. + See fz_format_string for formatting details. +*/ +void fz_write_printf(fz_context *ctx, fz_output *out, const char *fmt, ...); + +/** + va_list version of fz_write_printf. +*/ +void fz_write_vprintf(fz_context *ctx, fz_output *out, const char *fmt, va_list ap); + +/** + Seek to the specified file position. + See fseek for arguments. + + Throw an error on unseekable outputs. +*/ +void fz_seek_output(fz_context *ctx, fz_output *out, int64_t off, int whence); + +/** + Return the current file position. + + Throw an error on untellable outputs. +*/ +int64_t fz_tell_output(fz_context *ctx, fz_output *out); + +/** + Flush unwritten data. +*/ +void fz_flush_output(fz_context *ctx, fz_output *out); + +/** + Flush pending output and close an output stream. +*/ +void fz_close_output(fz_context *, fz_output *); + +/** + Free an output stream. Don't forget to close it first! +*/ +void fz_drop_output(fz_context *, fz_output *); + +/** + Query whether a given fz_output supports fz_stream_from_output. +*/ +int fz_output_supports_stream(fz_context *ctx, fz_output *out); + +/** + Obtain the fz_output in the form of a fz_stream. + + This allows data to be read back from some forms of fz_output + object. When finished reading, the fz_stream should be released + by calling fz_drop_stream. Until the fz_stream is dropped, no + further operations should be performed on the fz_output object. +*/ +fz_stream *fz_stream_from_output(fz_context *, fz_output *); + +/** + Truncate the output at the current position. 
+ + This allows output streams which have seeked back from the end + of their storage to be truncated at the current point. +*/ +void fz_truncate_output(fz_context *, fz_output *); + +/** + Write data to output. + + data: Pointer to data to write. + size: Size of data to write in bytes. +*/ +void fz_write_data(fz_context *ctx, fz_output *out, const void *data, size_t size); +void fz_write_buffer(fz_context *ctx, fz_output *out, fz_buffer *data); + +/** + Write a string. Does not write zero terminator. +*/ +void fz_write_string(fz_context *ctx, fz_output *out, const char *s); + +/** + Write different sized data to an output stream. +*/ +void fz_write_int32_be(fz_context *ctx, fz_output *out, int x); +void fz_write_int32_le(fz_context *ctx, fz_output *out, int x); +void fz_write_uint32_be(fz_context *ctx, fz_output *out, unsigned int x); +void fz_write_uint32_le(fz_context *ctx, fz_output *out, unsigned int x); +void fz_write_int16_be(fz_context *ctx, fz_output *out, int x); +void fz_write_int16_le(fz_context *ctx, fz_output *out, int x); +void fz_write_uint16_be(fz_context *ctx, fz_output *out, unsigned int x); +void fz_write_uint16_le(fz_context *ctx, fz_output *out, unsigned int x); +void fz_write_char(fz_context *ctx, fz_output *out, char x); +void fz_write_byte(fz_context *ctx, fz_output *out, unsigned char x); +void fz_write_float_be(fz_context *ctx, fz_output *out, float f); +void fz_write_float_le(fz_context *ctx, fz_output *out, float f); + +/** + Write a UTF-8 encoded unicode character. +*/ +void fz_write_rune(fz_context *ctx, fz_output *out, int rune); + +/** + Write a base64 encoded data block, optionally with periodic + newlines. +*/ +void fz_write_base64(fz_context *ctx, fz_output *out, const unsigned char *data, size_t size, int newline); + +/** + Write a base64 encoded fz_buffer, optionally with periodic + newlines. 
+*/ +void fz_write_base64_buffer(fz_context *ctx, fz_output *out, fz_buffer *data, int newline); + +/** + Write num_bits of data to the end of the output stream, assumed to be packed + most significant bits first. +*/ +void fz_write_bits(fz_context *ctx, fz_output *out, unsigned int data, int num_bits); + +/** + Sync to byte boundary after writing bits. +*/ +void fz_write_bits_sync(fz_context *ctx, fz_output *out); + +/** + Our customised 'printf'-like string formatter. + Takes %c, %d, %s, %u, %x, as usual. + Modifiers are not supported except for zero-padding ints (e.g. + %02d, %03u, %04x, etc). + %g output in "as short as possible hopefully lossless + non-exponent" form, see fz_ftoa for specifics. + %f and %e output as usual. + %C outputs a utf8 encoded int. + %M outputs a fz_matrix*. + %R outputs a fz_rect*. + %P outputs a fz_point*. + %n outputs a PDF name (with appropriate escaping). + %q and %( output escaped strings in C/PDF syntax. + %l{d,u,x} indicates that the values are int64_t. + %z{d,u,x} indicates that the value is a size_t. + + user: An opaque pointer that is passed to the emit function. + + emit: A function pointer called to emit output bytes as the + string is being formatted. +*/ +void fz_format_string(fz_context *ctx, void *user, void (*emit)(fz_context *ctx, void *user, int c), const char *fmt, va_list args); + +/** + A vsnprintf work-alike, using our custom formatter. +*/ +size_t fz_vsnprintf(char *buffer, size_t space, const char *fmt, va_list args); + +/** + The non va_list equivalent of fz_vsnprintf. +*/ +size_t fz_snprintf(char *buffer, size_t space, const char *fmt, ...); + +/** + Allocated sprintf. + + Returns a null terminated allocated block containing the + formatted version of the format string/args. +*/ +char *fz_asprintf(fz_context *ctx, const char *fmt, ...); + +/** + Save the contents of a buffer to a file. 
+*/ +void fz_save_buffer(fz_context *ctx, fz_buffer *buf, const char *filename); + +/** + Compression and other filtering outputs. + + These outputs write encoded data to another output. Create a + filter output with the destination, write to the filter, then + close and drop it when you're done. These can also be chained + together, for example to write ASCII Hex encoded, Deflate + compressed, and RC4 encrypted data to a buffer output. + + Output streams don't use reference counting, so make sure to + close all of the filters in the reverse order of creation so + that data is flushed properly. + + Accordingly, ownership of 'chain' is never passed into the + following functions, but remains with the caller, whose + responsibility it is to ensure they exist at least until + the returned fz_output is dropped. +*/ + +fz_output *fz_new_asciihex_output(fz_context *ctx, fz_output *chain); +fz_output *fz_new_ascii85_output(fz_context *ctx, fz_output *chain); +fz_output *fz_new_rle_output(fz_context *ctx, fz_output *chain); +fz_output *fz_new_arc4_output(fz_context *ctx, fz_output *chain, unsigned char *key, size_t keylen); +fz_output *fz_new_deflate_output(fz_context *ctx, fz_output *chain, int effort, int raw); + +#endif diff --git a/include/mupdf/fitz/path.h b/include/mupdf/fitz/path.h new file mode 100644 index 0000000..78c9b08 --- /dev/null +++ b/include/mupdf/fitz/path.h @@ -0,0 +1,447 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. 
See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_PATH_H +#define MUPDF_FITZ_PATH_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" + +/** + * Vector path buffer. + * It can be stroked and dashed, or be filled. + * It has a fill rule (nonzero or even_odd). + * + * When rendering, they are flattened, stroked and dashed straight + * into the Global Edge List. + */ + +typedef struct fz_path fz_path; + +typedef enum +{ + FZ_LINECAP_BUTT = 0, + FZ_LINECAP_ROUND = 1, + FZ_LINECAP_SQUARE = 2, + FZ_LINECAP_TRIANGLE = 3 +} fz_linecap; + +typedef enum +{ + FZ_LINEJOIN_MITER = 0, + FZ_LINEJOIN_ROUND = 1, + FZ_LINEJOIN_BEVEL = 2, + FZ_LINEJOIN_MITER_XPS = 3 +} fz_linejoin; + +typedef struct +{ + int refs; + fz_linecap start_cap, dash_cap, end_cap; + fz_linejoin linejoin; + float linewidth; + float miterlimit; + float dash_phase; + int dash_len; + float dash_list[32]; +} fz_stroke_state; + +typedef struct +{ + /* Compulsory ones */ + void (*moveto)(fz_context *ctx, void *arg, float x, float y); + void (*lineto)(fz_context *ctx, void *arg, float x, float y); + void (*curveto)(fz_context *ctx, void *arg, float x1, float y1, float x2, float y2, float x3, float y3); + void (*closepath)(fz_context *ctx, void *arg); + /* Optional ones */ + void (*quadto)(fz_context *ctx, void *arg, float x1, float y1, float x2, float y2); + void (*curvetov)(fz_context *ctx, void *arg, float x2, float y2, float x3, float y3); + void (*curvetoy)(fz_context *ctx, void *arg, float x1, float y1, float x3, float y3); + void (*rectto)(fz_context *ctx, void *arg, float x1, float y1, 
float x2, float y2); +} fz_path_walker; + +/** + Walk the segments of a path, calling the + appropriate callback function from a given set for each + segment of the path. + + path: The path to walk. + + walker: The set of callback functions to use. The first + 4 callback pointers in the set must be non-NULL. The + subsequent ones can either be supplied, or can be left + as NULL, in which case the top 4 functions will be + called as appropriate to simulate them. + + arg: An opaque argument passed in to each callback. + + Exceptions will only be thrown if the underlying callback + functions throw them. +*/ +void fz_walk_path(fz_context *ctx, const fz_path *path, const fz_path_walker *walker, void *arg); + +/** + Create a new (empty) path structure. +*/ +fz_path *fz_new_path(fz_context *ctx); + +/** + Increment the reference count. Returns the same pointer. + + All paths can be kept, regardless of their packing type. + + Never throws exceptions. +*/ +fz_path *fz_keep_path(fz_context *ctx, const fz_path *path); + +/** + Decrement the reference count. When the reference count hits + zero, free the path. + + All paths can be dropped, regardless of their packing type. + Packed paths do not own the blocks into which they are packed + so dropping them does not free those blocks. + + Never throws exceptions. +*/ +void fz_drop_path(fz_context *ctx, const fz_path *path); + +/** + Minimise the internal storage used by a path. + + As paths are constructed, the internal buffers + grow. To avoid repeated reallocations they + grow with some spare space. Once a path has + been fully constructed, this call allows the + excess space to be trimmed. +*/ +void fz_trim_path(fz_context *ctx, fz_path *path); + +/** + Return the number of bytes required to pack a path. +*/ +int fz_packed_path_size(const fz_path *path); + +/** + Pack a path into the given block. + To minimise the size of paths, this function allows them to be + packed into a buffer with other information. 
Paths can be used + interchangeably regardless of how they are packed. + + pack: Pointer to a block of data to pack the path into. Should + be aligned by the caller to the same alignment as required for + a fz_path pointer. + + path: The path to pack. + + Returns the number of bytes within the block used. Callers can + access the packed path data by casting the value of pack on + entry to be a fz_path *. + + Throws exceptions on failure to allocate. + + Implementation details: Paths can be 'unpacked', 'flat', or + 'open'. Standard paths, as created are 'unpacked'. Paths + will be packed as 'flat', unless they are too large + (where large indicates that they exceed some private + implementation defined limits, currently including having + more than 256 coordinates or commands). + + Large paths are 'open' packed as a header into the given block, + plus pointers to other data blocks. + + Users should not have to care about whether paths are 'open' + or 'flat' packed. Simply pack a path (if required), and then + forget about the details. +*/ +size_t fz_pack_path(fz_context *ctx, uint8_t *pack, const fz_path *path); + +/** + Clone the data for a path. + + This is used in preference to fz_keep_path when a whole + new copy of a path is required, rather than just a shared + pointer. This probably indicates that the path is about to + be modified. + + path: path to clone. + + Throws exceptions on failure to allocate. +*/ +fz_path *fz_clone_path(fz_context *ctx, fz_path *path); + +/** + Return the current point that a path has + reached or (0,0) if empty. + + path: path to return the current point of. +*/ +fz_point fz_currentpoint(fz_context *ctx, fz_path *path); + +/** + Append a 'moveto' command to a path. + This 'opens' a path. + + path: The path to modify. + + x, y: The coordinate to move to. + + Throws exceptions on failure to allocate, or attempting to + modify a packed path. 
+*/ +void fz_moveto(fz_context *ctx, fz_path *path, float x, float y); + +/** + Append a 'lineto' command to an open path. + + path: The path to modify. + + x, y: The coordinate to line to. + + Throws exceptions on failure to allocate, or attempting to + modify a packed path. +*/ +void fz_lineto(fz_context *ctx, fz_path *path, float x, float y); + +/** + Append a 'rectto' command to an open path. + + The rectangle is equivalent to: + moveto x0 y0 + lineto x1 y0 + lineto x1 y1 + lineto x0 y1 + closepath + + path: The path to modify. + + x0, y0: First corner of the rectangle. + + x1, y1: Second corner of the rectangle. + + Throws exceptions on failure to allocate, or attempting to + modify a packed path. +*/ +void fz_rectto(fz_context *ctx, fz_path *path, float x0, float y0, float x1, float y1); + +/** + Append a 'quadto' command to an open path. (For a + quadratic bezier). + + path: The path to modify. + + x0, y0: The control coordinates for the quadratic curve. + + x1, y1: The end coordinates for the quadratic curve. + + Throws exceptions on failure to allocate, or attempting to + modify a packed path. +*/ +void fz_quadto(fz_context *ctx, fz_path *path, float x0, float y0, float x1, float y1); + +/** + Append a 'curveto' command to an open path. (For a + cubic bezier). + + path: The path to modify. + + x0, y0: The coordinates of the first control point for the + curve. + + x1, y1: The coordinates of the second control point for the + curve. + + x2, y2: The end coordinates for the curve. + + Throws exceptions on failure to allocate, or attempting to + modify a packed path. +*/ +void fz_curveto(fz_context *ctx, fz_path *path, float x0, float y0, float x1, float y1, float x2, float y2); + +/** + Append a 'curvetov' command to an open path. (For a + cubic bezier with the first control coordinate equal to + the start point). + + path: The path to modify. + + x1, y1: The coordinates of the second control point for the + curve. 
+ + x2, y2: The end coordinates for the curve. + + Throws exceptions on failure to allocate, or attempting to + modify a packed path. +*/ +void fz_curvetov(fz_context *ctx, fz_path *path, float x1, float y1, float x2, float y2); + +/** + Append a 'curvetoy' command to an open path. (For a + cubic bezier with the second control coordinate equal to + the end point). + + path: The path to modify. + + x0, y0: The coordinates of the first control point for the + curve. + + x2, y2: The end coordinates for the curve (and the second + control coordinate). + + Throws exceptions on failure to allocate, or attempting to + modify a packed path. +*/ +void fz_curvetoy(fz_context *ctx, fz_path *path, float x0, float y0, float x2, float y2); + +/** + Close the current subpath. + + path: The path to modify. + + Throws exceptions on failure to allocate, attempting to modify + a packed path, and illegal path closes (i.e. closing a non-open + path). +*/ +void fz_closepath(fz_context *ctx, fz_path *path); + +/** + Transform a path by a given + matrix. + + path: The path to modify (must not be a packed path). + + transform: The transform to apply. + + Throws exceptions if the path is packed, or on failure + to allocate. +*/ +void fz_transform_path(fz_context *ctx, fz_path *path, fz_matrix transform); + +/** + Return a bounding rectangle for a path. + + path: The path to bound. + + stroke: If NULL, the bounding rectangle given is for + the filled path. If non-NULL the bounding rectangle + given is for the path stroked with the given attributes. + + ctm: The matrix to apply to the path during stroking. + + Returns the bounding rectangle.
+*/ +fz_rect fz_bound_path(fz_context *ctx, const fz_path *path, const fz_stroke_state *stroke, fz_matrix ctm); + +/** + Given a rectangle (assumed to be the bounding box for a path), + expand it to allow for the expansion of the bbox that would be + seen by stroking the path with the given stroke state and + transform. +*/ +fz_rect fz_adjust_rect_for_stroke(fz_context *ctx, fz_rect rect, const fz_stroke_state *stroke, fz_matrix ctm); + +/** + A sane 'default' stroke state. +*/ +FZ_DATA extern const fz_stroke_state fz_default_stroke_state; + +/** + Create a new (empty) stroke state structure (with no dash + data) and return a reference to it. + + Throws exception on failure to allocate. +*/ +fz_stroke_state *fz_new_stroke_state(fz_context *ctx); + +/** + Create a new (empty) stroke state structure, with room for + dash data of the given length, and return a reference to it. + + len: The number of dash elements to allow room for. + + Throws exception on failure to allocate. +*/ +fz_stroke_state *fz_new_stroke_state_with_dash_len(fz_context *ctx, int len); + +/** + Take an additional reference to a stroke state structure. + + No modifications should be carried out on a stroke + state to which more than one reference is held, as + this can cause race conditions. +*/ +fz_stroke_state *fz_keep_stroke_state(fz_context *ctx, const fz_stroke_state *stroke); + +/** + Drop a reference to a stroke state structure, destroying the + structure if it is the last reference. +*/ +void fz_drop_stroke_state(fz_context *ctx, const fz_stroke_state *stroke); + +/** + Given a reference to a (possibly) shared stroke_state structure, + return a reference to an equivalent stroke_state structure + that is guaranteed to be unshared (i.e. one that can + safely be modified). + + shared: The reference to a (possibly) shared structure + to unshare. Ownership of this reference is passed in + to this function, even in the case of exceptions being + thrown. 
+ + Exceptions may be thrown in the event of failure to + allocate if required. +*/ +fz_stroke_state *fz_unshare_stroke_state(fz_context *ctx, fz_stroke_state *shared); + +/** + Given a reference to a (possibly) shared stroke_state structure, + return a reference to a stroke_state structure (with room for a + given amount of dash data) that is guaranteed to be unshared + (i.e. one that can safely be modified). + + shared: The reference to a (possibly) shared structure + to unshare. Ownership of this reference is passed in + to this function, even in the case of exceptions being + thrown. + + Exceptions may be thrown in the event of failure to + allocate if required. +*/ +fz_stroke_state *fz_unshare_stroke_state_with_dash_len(fz_context *ctx, fz_stroke_state *shared, int len); + +/** + Create an identical stroke_state structure and return a + reference to it. + + stroke: The stroke state reference to clone. + + Exceptions may be thrown in the event of a failure to + allocate. +*/ +fz_stroke_state *fz_clone_stroke_state(fz_context *ctx, fz_stroke_state *stroke); + +#endif diff --git a/include/mupdf/fitz/pixmap.h b/include/mupdf/fitz/pixmap.h new file mode 100644 index 0000000..34460de --- /dev/null +++ b/include/mupdf/fitz/pixmap.h @@ -0,0 +1,474 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. 
If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_PIXMAP_H +#define MUPDF_FITZ_PIXMAP_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/store.h" +#include "mupdf/fitz/separation.h" + +/** + Pixmaps represent a set of pixels for a 2 dimensional region of + a plane. Each pixel has n components per pixel. The components + are in the order process-components, spot-colors, alpha, where + there can be 0 of any of those types. The data is in + premultiplied alpha when rendering, but non-premultiplied for + colorspace conversions and rescaling. +*/ + +typedef struct fz_overprint fz_overprint; + +/** + Return the bounding box for a pixmap. +*/ +fz_irect fz_pixmap_bbox(fz_context *ctx, const fz_pixmap *pix); + +/** + Return the width of the pixmap in pixels. +*/ +int fz_pixmap_width(fz_context *ctx, const fz_pixmap *pix); + +/** + Return the height of the pixmap in pixels. +*/ +int fz_pixmap_height(fz_context *ctx, const fz_pixmap *pix); + +/** + Return the x value of the pixmap in pixels. +*/ +int fz_pixmap_x(fz_context *ctx, const fz_pixmap *pix); + +/** + Return the y value of the pixmap in pixels. +*/ +int fz_pixmap_y(fz_context *ctx, const fz_pixmap *pix); + +/** + Create a new pixmap, with its origin at (0,0) + + cs: The colorspace to use for the pixmap, or NULL for an alpha + plane/mask. + + w: The width of the pixmap (in pixels) + + h: The height of the pixmap (in pixels) + + seps: Details of separations. + + alpha: 0 for no alpha, 1 for alpha. + + Returns a pointer to the new pixmap. Throws exception on failure + to allocate. 
+*/ +fz_pixmap *fz_new_pixmap(fz_context *ctx, fz_colorspace *cs, int w, int h, fz_separations *seps, int alpha); + +/** + Create a pixmap of a given size, location and pixel format. + + The bounding box specifies the size of the created pixmap and + where it will be located. The colorspace determines the number + of components per pixel. Alpha is always present. Pixmaps are + reference counted, so drop references using fz_drop_pixmap. + + colorspace: Colorspace format used for the created pixmap. The + pixmap will keep a reference to the colorspace. + + bbox: Bounding box specifying location/size of created pixmap. + + seps: Details of separations. + + alpha: 0 for no alpha, 1 for alpha. + + Returns a pointer to the new pixmap. Throws exception on failure + to allocate. +*/ +fz_pixmap *fz_new_pixmap_with_bbox(fz_context *ctx, fz_colorspace *colorspace, fz_irect bbox, fz_separations *seps, int alpha); + +/** + Create a new pixmap, with its origin at + (0,0) using the supplied data block. + + cs: The colorspace to use for the pixmap, or NULL for an alpha + plane/mask. + + w: The width of the pixmap (in pixels) + + h: The height of the pixmap (in pixels) + + seps: Details of separations. + + alpha: 0 for no alpha, 1 for alpha. + + stride: The byte offset from the pixel data in a row to the + pixel data in the next row. + + samples: The data block to keep the samples in. + + Returns a pointer to the new pixmap. Throws exception on failure to + allocate. +*/ +fz_pixmap *fz_new_pixmap_with_data(fz_context *ctx, fz_colorspace *colorspace, int w, int h, fz_separations *seps, int alpha, int stride, unsigned char *samples); + +/** + Create a pixmap of a given size, location and pixel format, + using the supplied data block. + + The bounding box specifies the size of the created pixmap and + where it will be located. The colorspace determines the number + of components per pixel. Alpha is always present. 
Pixmaps are + reference counted, so drop references using fz_drop_pixmap. + + colorspace: Colorspace format used for the created pixmap. The + pixmap will keep a reference to the colorspace. + + rect: Bounding box specifying location/size of created pixmap. + + seps: Details of separations. + + alpha: Number of alpha planes (0 or 1). + + samples: The data block to keep the samples in. + + Returns a pointer to the new pixmap. Throws exception on failure + to allocate. +*/ +fz_pixmap *fz_new_pixmap_with_bbox_and_data(fz_context *ctx, fz_colorspace *colorspace, fz_irect rect, fz_separations *seps, int alpha, unsigned char *samples); + +/** + Create a new pixmap that represents a subarea of the specified + pixmap. A reference is taken to this pixmap that will be dropped + on destruction. + + The supplied rectangle must be wholly contained within the + original pixmap. + + Returns a pointer to the new pixmap. Throws exception on failure + to allocate. +*/ +fz_pixmap *fz_new_pixmap_from_pixmap(fz_context *ctx, fz_pixmap *pixmap, const fz_irect *rect); + +/** + Clone a pixmap, copying the pixels and associated data to new + storage. + + The reference count of 'old' is unchanged. +*/ +fz_pixmap *fz_clone_pixmap(fz_context *ctx, const fz_pixmap *old); + +/** + Increment the reference count for the pixmap. The same pointer + is returned. + + Never throws exceptions. +*/ +fz_pixmap *fz_keep_pixmap(fz_context *ctx, fz_pixmap *pix); + +/** + Decrement the reference count for the pixmap. When the + reference count hits 0, the pixmap is freed. + + Never throws exceptions. +*/ +void fz_drop_pixmap(fz_context *ctx, fz_pixmap *pix); + +/** + Return the colorspace of a pixmap + + Returns colorspace. +*/ +fz_colorspace *fz_pixmap_colorspace(fz_context *ctx, const fz_pixmap *pix); + +/** + Return the number of components in a pixmap. + + Returns the number of components (including spots and alpha). 
+*/ +int fz_pixmap_components(fz_context *ctx, const fz_pixmap *pix); + +/** + Return the number of colorants in a pixmap. + + Returns the number of colorants (components, less any spots and + alpha). +*/ +int fz_pixmap_colorants(fz_context *ctx, const fz_pixmap *pix); + +/** + Return the number of spots in a pixmap. + + Returns the number of spots (components, less colorants and + alpha). Does not throw exceptions. +*/ +int fz_pixmap_spots(fz_context *ctx, const fz_pixmap *pix); + +/** + Return the number of alpha planes in a pixmap. + + Returns the number of alphas. Does not throw exceptions. +*/ +int fz_pixmap_alpha(fz_context *ctx, const fz_pixmap *pix); + +/** + Returns a pointer to the pixel data of a pixmap. + + Returns the pointer. +*/ +unsigned char *fz_pixmap_samples(fz_context *ctx, const fz_pixmap *pix); + +/** + Return the number of bytes in a row in the pixmap. +*/ +int fz_pixmap_stride(fz_context *ctx, const fz_pixmap *pix); + +/** + Set the pixels per inch resolution of the pixmap. +*/ +void fz_set_pixmap_resolution(fz_context *ctx, fz_pixmap *pix, int xres, int yres); + +/** + Clears a pixmap with the given value. + + pix: The pixmap to clear. + + value: Values in the range 0 to 255 are valid. Each component + sample for each pixel in the pixmap will be set to this value, + while alpha will always be set to 255 (non-transparent). + + This function is horrible, and should be removed from the + API and replaced with a less magic one. +*/ +void fz_clear_pixmap_with_value(fz_context *ctx, fz_pixmap *pix, int value); + +/** + Fill pixmap with solid color. +*/ +void fz_fill_pixmap_with_color(fz_context *ctx, fz_pixmap *pix, fz_colorspace *colorspace, float *color, fz_color_params color_params); + +/** + Clears a subrect of a pixmap with the given value. + + pix: The pixmap to clear. + + value: Values in the range 0 to 255 are valid. 
Each component + sample for each pixel in the pixmap will be set to this value, + while alpha will always be set to 255 (non-transparent). + + r: The rectangle to clear. +*/ +void fz_clear_pixmap_rect_with_value(fz_context *ctx, fz_pixmap *pix, int value, fz_irect r); + +/** + Sets all components (including alpha) of + all pixels in a pixmap to 0. + + pix: The pixmap to clear. +*/ +void fz_clear_pixmap(fz_context *ctx, fz_pixmap *pix); + +/** + Invert all the pixels in a pixmap. All components (process and + spots) of all pixels are inverted (except alpha, which is + unchanged). +*/ +void fz_invert_pixmap(fz_context *ctx, fz_pixmap *pix); + +/** + Invert the alpha of all the pixels in a pixmap. +*/ +void fz_invert_pixmap_alpha(fz_context *ctx, fz_pixmap *pix); + +/** + Transform the pixels in a pixmap so that luminance of each + pixel is inverted, and the chrominance remains unchanged (as + much as accuracy allows). + + All components of all pixels are inverted (except alpha, which + is unchanged). Only supports Gray and RGB pixmaps. +*/ +void fz_invert_pixmap_luminance(fz_context *ctx, fz_pixmap *pix); + +/** + Tint all the pixels in an RGB, BGR, or Gray pixmap. + + black: Map black to this hexadecimal RGB color. + + white: Map white to this hexadecimal RGB color. +*/ +void fz_tint_pixmap(fz_context *ctx, fz_pixmap *pix, int black, int white); + +/** + Invert all the pixels in a given rectangle of a (premultiplied) + pixmap. All components of all pixels in the rectangle are + inverted (except alpha, which is unchanged). +*/ +void fz_invert_pixmap_rect(fz_context *ctx, fz_pixmap *image, fz_irect rect); + +/** + Invert all the pixels in a non-premultiplied pixmap in a + very naive manner. +*/ +void fz_invert_pixmap_raw(fz_context *ctx, fz_pixmap *pix); + +/** + Apply gamma correction to a pixmap. All components + of all pixels are modified (except alpha, which is unchanged). + + gamma: The gamma value to apply; 1.0 for no change.
+*/ +void fz_gamma_pixmap(fz_context *ctx, fz_pixmap *pix, float gamma); + +/** + Convert an existing pixmap to a desired + colorspace. Other properties of the pixmap, such as resolution + and position are copied to the converted pixmap. + + pix: The pixmap to convert. + + default_cs: If NULL pix->colorspace is used. It is possible that + the data may need to be interpreted as one of the color spaces + in default_cs. + + cs_des: Desired colorspace, may be NULL to denote alpha-only. + + prf: Proofing color space through which we need to convert. + + color_params: Parameters that may be used in conversion (e.g. + ri). + + keep_alpha: If 0 any alpha component is removed, otherwise + alpha is kept if present in the pixmap. +*/ +fz_pixmap *fz_convert_pixmap(fz_context *ctx, const fz_pixmap *pix, fz_colorspace *cs_des, fz_colorspace *prf, fz_default_colorspaces *default_cs, fz_color_params color_params, int keep_alpha); + +/** + Check if the pixmap is a 1-channel image containing samples with + only values 0 and 255 +*/ +int fz_is_pixmap_monochrome(fz_context *ctx, fz_pixmap *pixmap); + +/* Implementation details: subject to change.*/ + +fz_pixmap *fz_alpha_from_gray(fz_context *ctx, fz_pixmap *gray); +void fz_decode_tile(fz_context *ctx, fz_pixmap *pix, const float *decode); +void fz_md5_pixmap(fz_context *ctx, fz_pixmap *pixmap, unsigned char digest[16]); + +fz_stream * +fz_unpack_stream(fz_context *ctx, fz_stream *src, int depth, int w, int h, int n, int indexed, int pad, int skip); + +/** + Pixmaps represent a set of pixels for a 2 dimensional region of + a plane. Each pixel has n components per pixel. The components + are in the order process-components, spot-colors, alpha, where + there can be 0 of any of those types. The data is in + premultiplied alpha when rendering, but non-premultiplied for + colorspace conversions and rescaling. + + x, y: The minimum x and y coord of the region in pixels. + + w, h: The width and height of the region in pixels. 
+ + n: The number of color components in the image. + n = num composite colors + num spots + num alphas + + s: The number of spot channels in the image. + + alpha: 0 for no alpha, 1 for alpha present. + + flags: flag bits. + Bit 0: If set, draw the image with linear interpolation. + Bit 1: If set, free the samples buffer when the pixmap + is destroyed. + + stride: The byte offset from the data for any given pixel + to the data for the same pixel on the row below. + + seps: NULL, or a pointer to a separations structure. If NULL, + s should be 0. + + xres, yres: Image resolution in dpi. Default is 96 dpi. + + colorspace: Pointer to a colorspace object describing the + colorspace the pixmap is in. If NULL, the image is a mask. + + samples: Pointer to the first byte of the pixmap sample data. + This is typically a simple block of memory w * h * n bytes of + memory in which the components are stored linearly, but with the + use of appropriate stride values, scanlines can be stored in + different orders, and have different amounts of padding. The + first n bytes are components 0 to n-1 for the pixel at (x,y). + Each successive n bytes gives another pixel in scanline order + as we move across the line. The start of each scanline is offset + the start of the previous one by stride bytes. +*/ +struct fz_pixmap +{ + fz_storable storable; + int x, y, w, h; + unsigned char n; + unsigned char s; + unsigned char alpha; + unsigned char flags; + ptrdiff_t stride; + fz_separations *seps; + int xres, yres; + fz_colorspace *colorspace; + unsigned char *samples; + fz_pixmap *underlying; +}; + +enum +{ + FZ_PIXMAP_FLAG_INTERPOLATE = 1, + FZ_PIXMAP_FLAG_FREE_SAMPLES = 2 +}; + +/* Create a new pixmap from a warped section of another. + * + * Colorspace, resolution etc are inherited from the original. + * points give the corner points within the original pixmap of a + * (convex) quadrilateral. 
These corner points will be 'warped' to be + * the corner points of the returned bitmap, which will have the given + * width/height. + */ +fz_pixmap * +fz_warp_pixmap(fz_context *ctx, fz_pixmap *src, const fz_point points[4], int width, int height); + +/* + Convert between different separation results. +*/ +fz_pixmap *fz_clone_pixmap_area_with_different_seps(fz_context *ctx, fz_pixmap *src, const fz_irect *bbox, fz_colorspace *dcs, fz_separations *seps, fz_color_params color_params, fz_default_colorspaces *default_cs); + +/* + * Extract alpha channel as a separate pixmap. + * Returns NULL if there is no alpha channel in the source. + */ +fz_pixmap *fz_new_pixmap_from_alpha_channel(fz_context *ctx, fz_pixmap *src); + +/* + * Combine a pixmap without an alpha channel with a soft mask. + */ +fz_pixmap *fz_new_pixmap_from_color_and_mask(fz_context *ctx, fz_pixmap *color, fz_pixmap *mask); + +#endif diff --git a/include/mupdf/fitz/pool.h b/include/mupdf/fitz/pool.h new file mode 100644 index 0000000..43bc6b2 --- /dev/null +++ b/include/mupdf/fitz/pool.h @@ -0,0 +1,68 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. 
+// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_POOL_H +#define MUPDF_FITZ_POOL_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" + +/** + Simple pool allocators. + + Allocate from the pool, which can then be freed at once. +*/ +typedef struct fz_pool fz_pool; + +/** + Create a new pool to allocate from. +*/ +fz_pool *fz_new_pool(fz_context *ctx); + +/** + Allocate a block of size bytes from the pool. +*/ +void *fz_pool_alloc(fz_context *ctx, fz_pool *pool, size_t size); + +/** + strdup equivalent allocating from the pool. +*/ +char *fz_pool_strdup(fz_context *ctx, fz_pool *pool, const char *s); + +/** + The current size of the pool. + + The number of bytes of storage currently allocated to the pool. + This is the total of the storage used for the blocks making + up the pool, rather than the total of the allocated blocks so far, + so it will increase in 'lumps'. + from the pool, then the pool size may still be X +*/ +size_t fz_pool_size(fz_context *ctx, fz_pool *pool); + +/** + Drop a pool, freeing and invalidating all storage returned from + the pool. +*/ +void fz_drop_pool(fz_context *ctx, fz_pool *pool); + +#endif diff --git a/include/mupdf/fitz/separation.h b/include/mupdf/fitz/separation.h new file mode 100644 index 0000000..0a6e2fd --- /dev/null +++ b/include/mupdf/fitz/separation.h @@ -0,0 +1,138 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version.
+// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_SEPARATION_H +#define MUPDF_FITZ_SEPARATION_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/color.h" + +/** + A fz_separations structure holds details of a set of separations + (such as might be used within a page of the document). + + The app might control the separations by enabling/disabling them, + and subsequent renders would take this into account. +*/ + +enum +{ + FZ_MAX_SEPARATIONS = 64 +}; + +typedef struct fz_separations fz_separations; + +typedef enum +{ + /* "Composite" separations are rendered using process + * colors using the equivalent colors */ + FZ_SEPARATION_COMPOSITE = 0, + /* Spot colors are rendered into their own spot plane. */ + FZ_SEPARATION_SPOT = 1, + /* Disabled colors are not rendered at all in the final + * output. */ + FZ_SEPARATION_DISABLED = 2 +} fz_separation_behavior; + +/** + Create a new separations structure (initially empty) +*/ +fz_separations *fz_new_separations(fz_context *ctx, int controllable); + +/** + Increment the reference count for a separations structure. + Returns the same pointer. + + Never throws exceptions. +*/ +fz_separations *fz_keep_separations(fz_context *ctx, fz_separations *sep); + +/** + Decrement the reference count for a separations structure. + When the reference count hits zero, the separations structure + is freed. + + Never throws exceptions.
+*/ +void fz_drop_separations(fz_context *ctx, fz_separations *sep); + +/** + Add a separation (null terminated name, colorspace) +*/ +void fz_add_separation(fz_context *ctx, fz_separations *sep, const char *name, fz_colorspace *cs, int cs_channel); + +/** + Add a separation with equivalents (null terminated name, + colorspace) + + (old, deprecated) +*/ +void fz_add_separation_equivalents(fz_context *ctx, fz_separations *sep, uint32_t rgba, uint32_t cmyk, const char *name); + +/** + Control the rendering of a given separation. +*/ +void fz_set_separation_behavior(fz_context *ctx, fz_separations *sep, int separation, fz_separation_behavior behavior); + +/** + Test for the current behavior of a separation. +*/ +fz_separation_behavior fz_separation_current_behavior(fz_context *ctx, const fz_separations *sep, int separation); + +const char *fz_separation_name(fz_context *ctx, const fz_separations *sep, int separation); +int fz_count_separations(fz_context *ctx, const fz_separations *sep); + +/** + Return the number of active separations. +*/ +int fz_count_active_separations(fz_context *ctx, const fz_separations *seps); + +/** + Compare 2 separations structures (or NULLs). + + Return 0 if identical, non-zero if not identical. +*/ +int fz_compare_separations(fz_context *ctx, const fz_separations *sep1, const fz_separations *sep2); + + +/** + Return a separations object with all the spots in the input + separations object that are set to composite, reset to be + enabled. If there ARE no spots in the object, this returns + NULL. If the object already has all its spots enabled, then + just returns another handle on the same object. +*/ +fz_separations *fz_clone_separations_for_overprint(fz_context *ctx, fz_separations *seps); + +/** + Convert a color given in terms of one colorspace, + to a color in terms of another colorspace/separations. 
+*/ +void fz_convert_separation_colors(fz_context *ctx, fz_colorspace *src_cs, const float *src_color, fz_separations *dst_seps, fz_colorspace *dst_cs, float *dst_color, fz_color_params color_params); + +/** + Get the equivalent separation color in a given colorspace. +*/ +void fz_separation_equivalent(fz_context *ctx, const fz_separations *seps, int idx, fz_colorspace *dst_cs, float *dst_color, fz_colorspace *prf, fz_color_params color_params); + +#endif diff --git a/include/mupdf/fitz/shade.h b/include/mupdf/fitz/shade.h new file mode 100644 index 0000000..8acde99 --- /dev/null +++ b/include/mupdf/fitz/shade.h @@ -0,0 +1,231 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_SHADE_H +#define MUPDF_FITZ_SHADE_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/store.h" +#include "mupdf/fitz/pixmap.h" +#include "mupdf/fitz/compressed-buffer.h" + +/** + * The shading code uses gouraud shaded triangle meshes. 
+ */ + +enum +{ + FZ_FUNCTION_BASED = 1, + FZ_LINEAR = 2, + FZ_RADIAL = 3, + FZ_MESH_TYPE4 = 4, + FZ_MESH_TYPE5 = 5, + FZ_MESH_TYPE6 = 6, + FZ_MESH_TYPE7 = 7 +}; + +/** + Structure is public to allow derived classes. Do not + access the members directly. +*/ +typedef struct +{ + fz_storable storable; + + fz_rect bbox; /* can be fz_infinite_rect */ + fz_colorspace *colorspace; + + fz_matrix matrix; /* matrix from pattern dict */ + int use_background; /* background color for fills but not 'sh' */ + float background[FZ_MAX_COLORS]; + + /* Just to be confusing, PDF Shadings of Type 1 (Function Based + * Shadings), do NOT use_function, but all the others do. This + * is because Type 1 shadings take 2 inputs, whereas all the + * others (when used with a function) take 1 input. The type 1 + * data is in the 'f' field of the union below. */ + int use_function; + float function[256][FZ_MAX_COLORS + 1]; + + int type; /* function, linear, radial, mesh */ + union + { + struct + { + int extend[2]; + float coords[2][3]; /* (x,y,r) twice */ + } l_or_r; + struct + { + int vprow; + int bpflag; + int bpcoord; + int bpcomp; + float x0, x1; + float y0, y1; + float c0[FZ_MAX_COLORS]; + float c1[FZ_MAX_COLORS]; + } m; + struct + { + fz_matrix matrix; + int xdivs; + int ydivs; + float domain[2][2]; + float *fn_vals; + } f; + } u; + + fz_compressed_buffer *buffer; +} fz_shade; + +/** + Increment the reference count for the shade structure. The + same pointer is returned. + + Never throws exceptions. +*/ +fz_shade *fz_keep_shade(fz_context *ctx, fz_shade *shade); + +/** + Decrement the reference count for the shade structure. When + the reference count hits zero, the structure is freed. + + Never throws exceptions. +*/ +void fz_drop_shade(fz_context *ctx, fz_shade *shade); + +/** + Bound a given shading. + + shade: The shade to bound. + + ctm: The transform to apply to the shade before bounding.
+ + Returns the bounds for the shading. +*/ +fz_rect fz_bound_shade(fz_context *ctx, fz_shade *shade, fz_matrix ctm); + +typedef struct fz_shade_color_cache fz_shade_color_cache; + +void fz_drop_shade_color_cache(fz_context *ctx, fz_shade_color_cache *cache); + +/** + Render a shade to a given pixmap. + + shade: The shade to paint. + + override_cs: NULL, or colorspace to override the shade's + inbuilt colorspace. + + ctm: The transform to apply. + + dest: The pixmap to render into. + + color_params: The color rendering settings + + bbox: Bounding box to limit the rendering + of the shade. + + eop: NULL, or pointer to overprint bitmap. + + cache: *cache is used to cache color information. If *cache is NULL it + is set to point to a new fz_shade_color_cache. If cache is NULL it is + ignored. +*/ +void fz_paint_shade(fz_context *ctx, fz_shade *shade, fz_colorspace *override_cs, fz_matrix ctm, fz_pixmap *dest, fz_color_params color_params, fz_irect bbox, const fz_overprint *eop, fz_shade_color_cache **cache); + +/** + * Handy routine for processing mesh based shades + */ +typedef struct +{ + fz_point p; + float c[FZ_MAX_COLORS]; +} fz_vertex; + +/** + Callback function type for use with + fz_process_shade. + + arg: Opaque pointer from fz_process_shade caller. + + v: Pointer to a fz_vertex structure to populate. + + c: Pointer to an array of floats used to populate v. +*/ +typedef void (fz_shade_prepare_fn)(fz_context *ctx, void *arg, fz_vertex *v, const float *c); + +/** + Callback function type for use with + fz_process_shade. + + arg: Opaque pointer from fz_process_shade caller. + + av, bv, cv: Pointers to a fz_vertex structure describing + the corner locations and colors of a triangle to be + filled. +*/ +typedef void (fz_shade_process_fn)(fz_context *ctx, void *arg, fz_vertex *av, fz_vertex *bv, fz_vertex *cv); + +/** + Process a shade, using supplied callback functions.
This + decomposes the shading to a mesh (even ones that are not + natively meshes, such as linear or radial shadings), and + processes triangles from those meshes. + + shade: The shade to process. + + ctm: The transform to use + + prepare: Callback function to 'prepare' each vertex. + This function is passed an array of floats, and populates + a fz_vertex structure. + + process: This function is passed 3 pointers to vertex + structures, and actually performs the processing (typically + filling the area between the vertexes). + + process_arg: An opaque argument passed through from caller + to callback functions. +*/ +void fz_process_shade(fz_context *ctx, fz_shade *shade, fz_matrix ctm, fz_rect scissor, + fz_shade_prepare_fn *prepare, + fz_shade_process_fn *process, + void *process_arg); + + +/* Implementation details: subject to change. */ + +/** + Internal function to destroy a + shade. Only exposed for use with the fz_store. + + shade: The reference to destroy. +*/ +void fz_drop_shade_imp(fz_context *ctx, fz_storable *shade); + +#endif diff --git a/include/mupdf/fitz/store.h b/include/mupdf/fitz/store.h new file mode 100644 index 0000000..027c145 --- /dev/null +++ b/include/mupdf/fitz/store.h @@ -0,0 +1,442 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. 
If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_STORE_H +#define MUPDF_FITZ_STORE_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/output.h" +#include "mupdf/fitz/log.h" + +/** + Resource store + + MuPDF stores decoded "objects" into a store for potential reuse. + If the size of the store gets too big, objects stored within it + can be evicted and freed to recover space. When MuPDF comes to + decode such an object, it will check to see if a version of this + object is already in the store - if it is, it will simply reuse + it. If not, it will decode it and place it into the store. + + All objects that can be placed into the store are derived from + the fz_storable type (i.e. this should be the first component of + the objects structure). This allows for consistent (thread safe) + reference counting, and includes a function that will be called + to free the object as soon as the reference count reaches zero. + + Most objects offer fz_keep_XXXX/fz_drop_XXXX functions derived + from fz_keep_storable/fz_drop_storable. Creation of such objects + includes a call to FZ_INIT_STORABLE to set up the fz_storable + header. + */ +typedef struct fz_storable fz_storable; + +/** + Function type for a function to drop a storable object. + + Objects within the store are identified by type by comparing + their drop_fn pointers. +*/ +typedef void (fz_store_drop_fn)(fz_context *, fz_storable *); + +/** + Any storable object should include an fz_storable structure + at the start (by convention at least) of their structure. + (Unless it starts with an fz_key_storable, see below). 
+*/ +struct fz_storable { + int refs; + fz_store_drop_fn *drop; +}; + +/** + Any storable object that can appear in the key of another + storable object should include an fz_key_storable structure + at the start (by convention at least) of their structure. +*/ +typedef struct +{ + fz_storable storable; + short store_key_refs; +} fz_key_storable; + +/** + Macro to initialise a storable object. +*/ +#define FZ_INIT_STORABLE(S_,RC,DROP) \ + do { fz_storable *S = &(S_)->storable; S->refs = (RC); \ + S->drop = (DROP); \ + } while (0) + +/** + Macro to initialise a key storable object. +*/ +#define FZ_INIT_KEY_STORABLE(KS_,RC,DROP) \ + do { fz_key_storable *KS = &(KS_)->key_storable; KS->store_key_refs = 0;\ + FZ_INIT_STORABLE(KS,RC,DROP); \ + } while (0) + +/** + Increment the reference count for a storable object. + Returns the same pointer. + + Never throws exceptions. +*/ +void *fz_keep_storable(fz_context *, const fz_storable *); + +/** + Decrement the reference count for a storable object. When the + reference count hits zero, the drop function for that object + is called to free the object. + + Never throws exceptions. +*/ +void fz_drop_storable(fz_context *, const fz_storable *); + +/** + Increment the (normal) reference count for a key storable + object. Returns the same pointer. + + Never throws exceptions. +*/ +void *fz_keep_key_storable(fz_context *, const fz_key_storable *); + +/** + Decrement the (normal) reference count for a storable object. + When the total reference count hits zero, the drop function for + that object is called to free the object. + + Never throws exceptions. +*/ +void fz_drop_key_storable(fz_context *, const fz_key_storable *); + +/** + Increment the (key) reference count for a key storable + object. Returns the same pointer. + + Never throws exceptions. +*/ +void *fz_keep_key_storable_key(fz_context *, const fz_key_storable *); + +/** + Decrement the (key) reference count for a storable object. 
+ When the total reference count hits zero, the drop function for + that object is called to free the object. + + Never throws exceptions. +*/ +void fz_drop_key_storable_key(fz_context *, const fz_key_storable *); + +/** + The store can be seen as a dictionary that maps keys to + fz_storable values. In order to allow keys of different types to + be stored, we have a structure full of functions for each key + 'type'; this fz_store_type pointer is stored with each key, and + tells the store how to perform certain operations (like taking/ + dropping a reference, comparing two keys, outputting details for + debugging etc). + + The store uses a hash table internally for speed where possible. + In order for this to work, we need a mechanism for turning a + generic 'key' into 'a hashable string'. For this purpose the + type structure contains a make_hash_key function pointer that + maps from a void * to a fz_store_hash structure. If + make_hash_key function returns 0, then the key is determined not + to be hashable, and the value is not stored in the hash table. + + Some objects can be used both as values within the store, and as + a component of keys within the store. We refer to these objects + as "key storable" objects. In this case, we need to take + additional care to ensure that we do not end up keeping an item + within the store, purely because its value is referred to by + another key in the store. + + An example of this are fz_images in PDF files. Each fz_image is + placed into the store to enable it to be easily reused. When the + image is rendered, a pixmap is generated from the image, and the + pixmap is placed into the store so it can be reused on + subsequent renders. The image forms part of the key for the + pixmap. + + When we close the pdf document (and any associated pages/display + lists etc), we drop the images from the store. 
This may leave us + in the position of the images having non-zero reference counts + purely because they are used as part of the keys for the + pixmaps. + + We therefore use special reference counting functions to keep + track of these "key storable" items, and hence store the number + of references to these items that are used in keys. + + When the number of references to an object == the number of + references to an object from keys in the store, we know that we + can remove all the items which have that object as part of the + key. This is done by running a pass over the store, 'reaping' + those items. + + Reap passes are slower than we would like as they touch every + item in the store. We therefore provide a way to 'batch' such + reap passes together, using fz_defer_reap_start/ + fz_defer_reap_end to bracket a region in which many may be + triggered. +*/ +typedef struct +{ + fz_store_drop_fn *drop; + union + { + struct + { + const void *ptr; + int i; + } pi; /* 8 or 12 bytes */ + struct + { + const void *ptr; + int i; + fz_irect r; + } pir; /* 24 or 28 bytes */ + struct + { + int id; + char has_shape; + char has_group_alpha; + float m[4]; + void *ptr; + } im; /* 28 or 32 bytes */ + struct + { + unsigned char src_md5[16]; + unsigned char dst_md5[16]; + unsigned int ri:2; + unsigned int bp:1; + unsigned int format:1; + unsigned int proof:1; + unsigned int src_extras:5; + unsigned int dst_extras:5; + unsigned int copy_spots:1; + unsigned int bgr:1; + } link; /* 36 bytes */ + } u; +} fz_store_hash; /* 40 or 44 bytes */ + +/** + Every type of object to be placed into the store defines an + fz_store_type. This contains the pointers to functions to + make hashes, manipulate keys, and check for needing reaping. 
+*/ +typedef struct +{ + const char *name; + int (*make_hash_key)(fz_context *ctx, fz_store_hash *hash, void *key); + void *(*keep_key)(fz_context *ctx, void *key); + void (*drop_key)(fz_context *ctx, void *key); + int (*cmp_key)(fz_context *ctx, void *a, void *b); + void (*format_key)(fz_context *ctx, char *buf, size_t size, void *key); + int (*needs_reap)(fz_context *ctx, void *key); +} fz_store_type; + +/** + Create a new store inside the context + + max: The maximum size (in bytes) that the store is allowed to + grow to. FZ_STORE_UNLIMITED means no limit. +*/ +void fz_new_store_context(fz_context *ctx, size_t max); + +/** + Increment the reference count for the store context. Returns + the same pointer. + + Never throws exceptions. +*/ +fz_store *fz_keep_store_context(fz_context *ctx); + +/** + Decrement the reference count for the store context. When the + reference count hits zero, the store context is freed. + + Never throws exceptions. +*/ +void fz_drop_store_context(fz_context *ctx); + +/** + Add an item to the store. + + Add an item into the store, returning NULL for success. If an + item with the same key is found in the store, then our item will + not be inserted, and the function will return a pointer to that + value instead. This function takes its own reference to val, as + required (i.e. the caller maintains ownership of its own + reference). + + key: The key used to index the item. + + val: The value to store. + + itemsize: The size in bytes of the value (as counted towards the + store size). + + type: Functions used to manipulate the key. +*/ +void *fz_store_item(fz_context *ctx, void *key, void *val, size_t itemsize, const fz_store_type *type); + +/** + Find an item within the store. + + drop: The function used to free the value (to ensure we get a + value of the correct type). + + key: The key used to index the item. + + type: Functions used to manipulate the key. 
+ + Returns NULL for not found, otherwise returns a pointer to the + value indexed by key to which a reference has been taken. +*/ +void *fz_find_item(fz_context *ctx, fz_store_drop_fn *drop, void *key, const fz_store_type *type); + +/** + Remove an item from the store. + + If an item indexed by the given key exists in the store, remove + it. + + drop: The function used to free the value (to ensure we get a + value of the correct type). + + key: The key used to find the item to remove. + + type: Functions used to manipulate the key. +*/ +void fz_remove_item(fz_context *ctx, fz_store_drop_fn *drop, void *key, const fz_store_type *type); + +/** + Evict every item from the store. +*/ +void fz_empty_store(fz_context *ctx); + +/** + Internal function used as part of the scavenging + allocator; when we fail to allocate memory, before returning a + failure to the caller, we try to scavenge space within the store + by evicting at least 'size' bytes. The allocator then retries. + + size: The number of bytes we are trying to have free. + + phase: What phase of the scavenge we are in. Updated on exit. + + Returns non zero if we managed to free any memory. +*/ +int fz_store_scavenge(fz_context *ctx, size_t size, int *phase); + +/** + External function for callers to use + to scavenge while trying allocations. + + size: The number of bytes we are trying to have free. + + phase: What phase of the scavenge we are in. Updated on exit. + + Returns non zero if we managed to free any memory. +*/ +int fz_store_scavenge_external(fz_context *ctx, size_t size, int *phase); + +/** + Evict items from the store until the total size of + the objects in the store is reduced to a given percentage of its + current size. + + percent: %age of current size to reduce the store to. + + Returns non zero if we managed to free enough memory, zero + otherwise. 
+*/ +int fz_shrink_store(fz_context *ctx, unsigned int percent); + +/** + Callback function called by fz_filter_store on every item within + the store. + + Return 1 to drop the item from the store, 0 to retain. +*/ +typedef int (fz_store_filter_fn)(fz_context *ctx, void *arg, void *key); + +/** + Filter every element in the store with a matching type with the + given function. + + If the function returns 1 for an element, drop the element. +*/ +void fz_filter_store(fz_context *ctx, fz_store_filter_fn *fn, void *arg, const fz_store_type *type); + +/** + Output debugging information for the current state of the store + to the given output channel. +*/ +void fz_debug_store(fz_context *ctx, fz_output *out); + +/** + Increment the defer reap count. + + No reap operations will take place (except for those + triggered by an immediate failed malloc) until the + defer reap count returns to 0. + + Call this at the start of a process during which you + potentially might drop many reapable objects. + + It is vital that every fz_defer_reap_start is matched + by a fz_defer_reap_end call. +*/ +void fz_defer_reap_start(fz_context *ctx); + +/** + Decrement the defer reap count. + + If the defer reap count returns to 0, and the store + has reapable objects in, a reap pass will begin. + + Call this at the end of a process during which you + potentially might drop many reapable objects. + + It is vital that every fz_defer_reap_start is matched + by a fz_defer_reap_end call. +*/ +void fz_defer_reap_end(fz_context *ctx); + +#ifdef ENABLE_STORE_LOGGING + +void fz_log_dump_store(fz_context *ctx, const char *fmt, ...); + +#define FZ_LOG_STORE(CTX, ...) fz_log_module(CTX, "STORE", __VA_ARGS__) +#define FZ_LOG_DUMP_STORE(...) fz_log_dump_store(__VA_ARGS__) + +#else + +#define FZ_LOG_STORE(...) do {} while (0) +#define FZ_LOG_DUMP_STORE(...) 
do {} while (0) + +#endif + +#endif diff --git a/include/mupdf/fitz/story-writer.h b/include/mupdf/fitz/story-writer.h new file mode 100644 index 0000000..6ecc132 --- /dev/null +++ b/include/mupdf/fitz/story-writer.h @@ -0,0 +1,209 @@ +// Copyright (C) 2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_STORY_WRITER_H +#define MUPDF_FITZ_STORY_WRITER_H + +#include "mupdf/fitz/story.h" +#include "mupdf/fitz/writer.h" + +/* + * A fz_story_element_position plus page number information; used with + * fz_write_story() and fz_write_stabilized_story(). + */ +typedef struct +{ + fz_story_element_position element; + int page_num; +} fz_write_story_position; + +/* + * A set of fz_write_story_position items; used with + * fz_write_stabilized_story(). + */ +typedef struct +{ + fz_write_story_position *positions; + int num; +} fz_write_story_positions; + + +/* + * Callback type used by fz_write_story() and fz_write_stabilized_story(). + * + * Should set *rect to rect number <num>. If this is on a new page should also + * set *mediabox and return 1, otherwise return 0.
+ * + * ref: + * As passed to fz_write_story() or fz_write_stabilized_story(). + * num: + * The rect number. Will typically increment by one each time, being reset + * to zero when fz_write_stabilized_story() starts a new iteration. + * filled: + * From earlier internal call to fz_place_story(). + * rect: + * Out param. + * ctm: + * Out param, defaults to fz_identity. + * mediabox: + * Out param, only used if we return 1. + */ +typedef int (fz_write_story_rectfn)(fz_context *ctx, void *ref, int num, fz_rect filled, fz_rect *rect, fz_matrix *ctm, fz_rect *mediabox); + +/* + * Callback used by fz_write_story() to report information about element + * positions. Slightly different from fz_story_position_callback() because + * it also includes the page number. + * + * ref: + * As passed to fz_write_story() or fz_write_stabilized_story(). + * position: + * Called via internal call to fz_story_position_callback(). + */ +typedef void (fz_write_story_positionfn)(fz_context *ctx, void *ref, const fz_write_story_position *position); + +/* + * Callback for fz_write_story(), called twice for each page, before (after=0) + * and after (after=1) the story is written. + * + * ref: + * As passed to fz_write_story() or fz_write_stabilized_story(). + * page_num: + * Page number, starting from 1. + * mediabox: + * As returned from fz_write_story_rectfn(). + * dev: + * Created from the fz_document_writer passed to fz_write_story() or + * fz_write_stabilized_story(). + * after: + * 0 - before writing the story. + * 1 - after writing the story. + */ +typedef void (fz_write_story_pagefn)(fz_context *ctx, void *ref, int page_num, fz_rect mediabox, fz_device *dev, int after); + +/* + * Callback type for fz_write_stabilized_story(). + * + * Should populate the supplied buffer with html content for use with internal + * calls to fz_new_story(). This may include extra content derived from + * information in <positions>, for example a table of contents. + * + * ref: + * As passed to fz_write_stabilized_story().
+ * positions: + * Information from previous iteration. + * buffer: + * Where to write the new content. Will be initially empty. + */ +typedef void (fz_write_story_contentfn)(fz_context *ctx, void *ref, const fz_write_story_positions *positions, fz_buffer *buffer); + + +/* + * Places and writes a story to a fz_document_writer. Avoids the need + * for calling code to implement a loop that calls fz_place_story() + * and fz_draw_story() etc, at the expense of having to provide a + * fz_write_story_rectfn() callback. + * + * story: + * The story to place and write. + * writer: + * Where to write the story; can be NULL. + * rectfn: + * Should return information about the rect to be used in the next + * internal call to fz_place_story(). + * rectfn_ref: + * Passed to rectfn(). + * positionfn: + * If not NULL, is called via internal calls to fz_story_positions(). + * positionfn_ref: + * Passed to positionfn(). + * pagefn: + * If not NULL, called at start and end of each page (before and after all + * story content has been written to the device). + * pagefn_ref: + * Passed to pagefn(). + */ +void fz_write_story( + fz_context *ctx, + fz_document_writer *writer, + fz_story *story, + fz_write_story_rectfn rectfn, + void *rectfn_ref, + fz_write_story_positionfn positionfn, + void *positionfn_ref, + fz_write_story_pagefn pagefn, + void *pagefn_ref + ); + + +/* + * Does iterative layout of html content to a fz_document_writer. For example + * this allows one to add a table of contents section while ensuring that page + * numbers are patched up until stable. + * + * Repeatedly creates new story from (contentfn(), contentfn_ref, user_css, em) + * and lays it out with internal call to fz_write_story(); uses a NULL writer + * and populates a fz_write_story_positions which is passed to the next call of + * contentfn(). + * + * When the html from contentfn() becomes unchanged, we do a final iteration + * using <writer>. + * + * writer: + * Where to write in the final iteration.
+ * user_css: + * Used in internal calls to fz_new_story(). + * em: + * Used in internal calls to fz_new_story(). + * contentfn: + * Should return html content for use with fz_new_story(), possibly + * including extra content such as a table-of-contents. + * contentfn_ref: + * Passed to contentfn(). + * rectfn: + * Should return information about the rect to be used in the next + * internal call to fz_place_story(). + * rectfn_ref: + * Passed to rectfn(). + * fz_write_story_pagefn: + * If not NULL, called at start and end of each page (before and after all + * story content has been written to the device). + * pagefn_ref: + * Passed to pagefn(). + * archive: + * NULL, or an archive to load images etc from. + */ +void fz_write_stabilized_story( + fz_context *ctx, + fz_document_writer *writer, + const char *user_css, + float em, + fz_write_story_contentfn contentfn, + void *contentfn_ref, + fz_write_story_rectfn rectfn, + void *rectfn_ref, + fz_write_story_pagefn pagefn, + void *pagefn_ref, + fz_archive *archive + ); + +#endif diff --git a/include/mupdf/fitz/story.h b/include/mupdf/fitz/story.h new file mode 100644 index 0000000..5a31c56 --- /dev/null +++ b/include/mupdf/fitz/story.h @@ -0,0 +1,206 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. 
If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_STORY_H +#define MUPDF_FITZ_STORY_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/device.h" +#include "mupdf/fitz/xml.h" +#include "mupdf/fitz/archive.h" + +/* + This header file provides an API for laying out and placing styled + text on a page, or pages. + + First a text story is created from some styled HTML. + + Next, this story can be laid out into a given rectangle (possibly + retrying several times with updated rectangles as required). + + Next, the laid out story can be drawn to a given device. + + In the case where the text story cannot be fitted into the given + areas all at once, these two steps can be repeated multiple + times until the text story is completely consumed. + + Finally, the text story can be dropped in the usual fashion. +*/ + + +typedef struct fz_story fz_story; + +/* + Create a text story using styled html. + + Passing a NULL buffer will be treated as an empty document. + Passing a NULL user_css will be treated as an empty CSS string. + A non-NULL archive will allow images etc to be loaded. The + story keeps its own reference, so the caller can drop its + reference after this call. +*/ +fz_story *fz_new_story(fz_context *ctx, fz_buffer *buf, const char *user_css, float em, fz_archive *archive); + +/* + Retrieve the warnings given from parsing this story. + + If there are warnings, this will be returned as a NULL terminated + C string. If there are no warnings, this will return NULL. + + These warnings will not be complete until AFTER any DOM manipulations + have been completed. 
+ + This function does not need to be called, but once it has been + the DOM is no longer accessible, and any fz_xml pointer + retrieved from fz_story_document is no longer valid. +*/ +const char *fz_story_warnings(fz_context *ctx, fz_story *story); + +/* + Place (or continue placing) a story into the supplied rectangle + 'where', updating 'filled' with the actual area that was used. + Returns zero if all the content fitted, non-zero if there is + more to fit. + + Note that filled may not be returned as a strict subset of + where, due to padding/margins at the bottom of pages, and + non-wrapping content extending to the right. + + Subsequent calls will attempt to place the same section of story + again and again, until the placed story is drawn using fz_draw_story, + whereupon subsequent calls to fz_place_story will attempt to place + the unused remainder of the story. + + After this function is called, the DOM is no longer accessible, + and any fz_xml pointer retrieved from fz_story_document is no + longer valid. +*/ +int fz_place_story(fz_context *ctx, fz_story *story, fz_rect where, fz_rect *filled); + +/* + Draw the placed story to the given device. + + This moves the point at which subsequent calls to fz_place_story + will restart placing to the end of what has just been output. +*/ +void fz_draw_story(fz_context *ctx, fz_story *story, fz_device *dev, fz_matrix ctm); + +/* + Reset the position within the story at which the next layout call + will continue to the start of the story. +*/ +void fz_reset_story(fz_context *ctx, fz_story *story); + +/* + Drop the html story. +*/ +void fz_drop_story(fz_context *ctx, fz_story *story); + +/* + Get a borrowed reference to the DOM document pointer for this + story. Do not destroy this reference, it will be destroyed + when the story is laid out. + + This only makes sense before the first placement of the story + or retrieval of the warnings. Once either of those things happens + the DOM representation is destroyed.
+*/ +fz_xml *fz_story_document(fz_context *ctx, fz_story *story); + + +typedef struct +{ + /* The overall depth of this element in the box structure. + * This can be used to compare the relative depths of different + * elements, but shouldn't be relied upon not to change between + * different versions of MuPDF. */ + int depth; + + /* The heading level of this element. 0 if not a header, or 1-6 for h1-h6. */ + int heading; + + /* The id for this element. */ + const char *id; + + /* The href for this element. */ + const char *href; + + /* The rectangle for this element. */ + fz_rect rect; + + /* The immediate text for this element. */ + const char *text; + + /* This indicates whether this opens and/or closes this element. + * + * As we traverse the tree we do a depth first search. In order for + * the caller of fz_story_positions to know whether a given element + * is inside another element, we therefore announce 'start' and 'stop' + * for each element. For instance, with: + * + *
<div id="part1"> + * <h1>Chapter 1</h1>... + * <h1>Chapter 2</h1>... + * ... + * </div> + * <div id="part2"> + * <h1>Chapter 10</h1>... + * <h1>Chapter 11</h1>... + * ... + * </div>
+ * + * We would announce: + * + id='part1' (open) + * + header=1 "Chapter 1" (open/close) + * + header=1 "Chapter 2" (open/close) + * ... + * + id='part1' (close) + * + id='part2' (open) + * + header=1 "Chapter 10" (open/close) + * + header=1 "Chapter 11" (open/close) + * ... + * + id='part2' (close) + * + * If bit 0 is set, then this 'opens' the element. + * If bit 1 is set, then this 'closes' the element. + */ + int open_close; + + /* A count of the number of rectangles that the layout code has split the + * story into so far. After the first layout, this will be 1. If a + * layout is repeated, this number is not incremented. */ + int rectangle_num; +} fz_story_element_position; + +typedef void (fz_story_position_callback)(fz_context *ctx, void *arg, const fz_story_element_position *); + +/* + Enumerate the positions for key blocks in the story. + + This will cause the supplied function to be called with details of each + element in the story that is either a header, or has an id. +*/ +void fz_story_positions(fz_context *ctx, fz_story *story, fz_story_position_callback *cb, void *arg); + +#endif diff --git a/include/mupdf/fitz/stream.h b/include/mupdf/fitz/stream.h new file mode 100644 index 0000000..59ba2ee --- /dev/null +++ b/include/mupdf/fitz/stream.h @@ -0,0 +1,605 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. 
If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_STREAM_H +#define MUPDF_FITZ_STREAM_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/buffer.h" + +/** + Return true if the named file exists and is readable. +*/ +int fz_file_exists(fz_context *ctx, const char *path); + +/** + fz_stream is a buffered reader capable of seeking in both + directions. + + Streams are reference counted, so references must be dropped + by a call to fz_drop_stream. + + Only the data between rp and wp is valid. +*/ +typedef struct fz_stream fz_stream; + +/** + Open the named file and wrap it in a stream. + + filename: Path to a file. On non-Windows machines the filename + should be exactly as it would be passed to fopen(2). On Windows + machines, the path should be UTF-8 encoded so that non-ASCII + characters can be represented. Other platforms do the encoding + as standard anyway (and in most cases, particularly for MacOS + and Linux, the encoding they use is UTF-8 anyway). +*/ +fz_stream *fz_open_file(fz_context *ctx, const char *filename); + +/** + Open the named file and wrap it in a stream. + + Does the same as fz_open_file, but in the event the file + does not open, it will return NULL rather than throw an + exception. +*/ +fz_stream *fz_try_open_file(fz_context *ctx, const char *name); + +#ifdef _WIN32 +/** + Open the named file and wrap it in a stream. + + This function is only available when compiling for Win32. + + filename: Wide character path to the file as it would be given + to _wfopen(). +*/ +fz_stream *fz_open_file_w(fz_context *ctx, const wchar_t *filename); +#endif /* _WIN32 */ + +/** + Open a block of memory as a stream. + + data: Pointer to start of data block. Ownership of the data + block is NOT passed in. 
+ + len: Number of bytes in data block. + + Returns pointer to newly created stream. May throw exceptions on + failure to allocate. +*/ +fz_stream *fz_open_memory(fz_context *ctx, const unsigned char *data, size_t len); + +/** + Open a buffer as a stream. + + buf: The buffer to open. Ownership of the buffer is NOT passed + in (this function takes its own reference). + + Returns pointer to newly created stream. May throw exceptions on + failure to allocate. +*/ +fz_stream *fz_open_buffer(fz_context *ctx, fz_buffer *buf); + +/** + Attach a filter to a stream that will store any + characters read from the stream into the supplied buffer. + + chain: The underlying stream to leech from. + + buf: The buffer into which the read data should be appended. + The buffer will be resized as required. + + Returns pointer to newly created stream. May throw exceptions on + failure to allocate. +*/ +fz_stream *fz_open_leecher(fz_context *ctx, fz_stream *chain, fz_buffer *buf); + +/** + Increments the reference count for a stream. Returns the same + pointer. + + Never throws exceptions. +*/ +fz_stream *fz_keep_stream(fz_context *ctx, fz_stream *stm); + +/** + Decrements the reference count for a stream. + + When the reference count for the stream hits zero, frees the + storage used for the fz_stream itself, and (usually) + releases the underlying resources that the stream is based upon + (depends on the method used to open the stream initially). +*/ +void fz_drop_stream(fz_context *ctx, fz_stream *stm); + +/** + return the current reading position within a stream +*/ +int64_t fz_tell(fz_context *ctx, fz_stream *stm); + +/** + Seek within a stream. + + stm: The stream to seek within. + + offset: The offset to seek to. + + whence: From where the offset is measured (see fseek). +*/ +void fz_seek(fz_context *ctx, fz_stream *stm, int64_t offset, int whence); + +/** + Read from a stream into a given data block. + + stm: The stream to read from. + + data: The data block to read into. 
+ + len: The length of the data block (in bytes). + + Returns the number of bytes read. May throw exceptions. +*/ +size_t fz_read(fz_context *ctx, fz_stream *stm, unsigned char *data, size_t len); + +/** + Read from a stream discarding data. + + stm: The stream to read from. + + len: The number of bytes to read. + + Returns the number of bytes read. May throw exceptions. +*/ +size_t fz_skip(fz_context *ctx, fz_stream *stm, size_t len); + +/** + Read all of a stream into a buffer. + + stm: The stream to read from + + initial: Suggested initial size for the buffer. + + Returns a buffer created from reading from the stream. May throw + exceptions on failure to allocate. +*/ +fz_buffer *fz_read_all(fz_context *ctx, fz_stream *stm, size_t initial); + +/** + Read all the contents of a file into a buffer. +*/ +fz_buffer *fz_read_file(fz_context *ctx, const char *filename); + +/** + Read all the contents of a file into a buffer. + + Returns NULL if the file does not exist, otherwise + behaves exactly as fz_read_file. +*/ +fz_buffer *fz_try_read_file(fz_context *ctx, const char *filename); + +/** + fz_read_[u]int(16|24|32|64)(_le)? + + Read a 16/32/64 bit signed/unsigned integer from stream, + in big or little-endian byte orders. + + Throws an exception if EOF is encountered. 
+*/ +uint16_t fz_read_uint16(fz_context *ctx, fz_stream *stm); +uint32_t fz_read_uint24(fz_context *ctx, fz_stream *stm); +uint32_t fz_read_uint32(fz_context *ctx, fz_stream *stm); +uint64_t fz_read_uint64(fz_context *ctx, fz_stream *stm); + +uint16_t fz_read_uint16_le(fz_context *ctx, fz_stream *stm); +uint32_t fz_read_uint24_le(fz_context *ctx, fz_stream *stm); +uint32_t fz_read_uint32_le(fz_context *ctx, fz_stream *stm); +uint64_t fz_read_uint64_le(fz_context *ctx, fz_stream *stm); + +int16_t fz_read_int16(fz_context *ctx, fz_stream *stm); +int32_t fz_read_int32(fz_context *ctx, fz_stream *stm); +int64_t fz_read_int64(fz_context *ctx, fz_stream *stm); + +int16_t fz_read_int16_le(fz_context *ctx, fz_stream *stm); +int32_t fz_read_int32_le(fz_context *ctx, fz_stream *stm); +int64_t fz_read_int64_le(fz_context *ctx, fz_stream *stm); + +float fz_read_float_le(fz_context *ctx, fz_stream *stm); +float fz_read_float(fz_context *ctx, fz_stream *stm); + +/** + Read a null terminated string from the stream into + a buffer of a given length. The buffer will be null terminated. + Throws on failure (including the failure to fit the entire + string including the terminator into the buffer). +*/ +void fz_read_string(fz_context *ctx, fz_stream *stm, char *buffer, int len); + +/** + A function type for use when implementing + fz_streams. The supplied function of this type is called + whenever data is required, and the current buffer is empty. + + stm: The stream to operate on. + + max: a hint as to the maximum number of bytes that the caller + needs to be ready immediately. Can safely be ignored. + + Returns -1 if there is no more data in the stream. Otherwise, + the function should find its internal state using stm->state, + refill its buffer, update stm->rp and stm->wp to point to the + start and end of the new data respectively, and then + "return *stm->rp++". 
+*/ +typedef int (fz_stream_next_fn)(fz_context *ctx, fz_stream *stm, size_t max); + +/** + A function type for use when implementing + fz_streams. The supplied function of this type is called + when the stream is dropped, to release the stream specific + state information. + + state: The stream state to release. +*/ +typedef void (fz_stream_drop_fn)(fz_context *ctx, void *state); + +/** + A function type for use when implementing + fz_streams. The supplied function of this type is called when + fz_seek is requested, and the arguments are as defined for + fz_seek. + + The stream can find its private state in stm->state. +*/ +typedef void (fz_stream_seek_fn)(fz_context *ctx, fz_stream *stm, int64_t offset, int whence); + +struct fz_stream +{ + int refs; + int error; + int eof; + int progressive; + int64_t pos; + int avail; + int bits; + unsigned char *rp, *wp; + void *state; + fz_stream_next_fn *next; + fz_stream_drop_fn *drop; + fz_stream_seek_fn *seek; +}; + +/** + Create a new stream object with the given + internal state and function pointers. + + state: Internal state (opaque to everything but implementation). + + next: Should provide the next set of bytes (up to max) of stream + data. Return the number of bytes read, or EOF when there is no + more data. + + drop: Should clean up and free the internal state. May not + throw exceptions. +*/ +fz_stream *fz_new_stream(fz_context *ctx, void *state, fz_stream_next_fn *next, fz_stream_drop_fn *drop); + +/** + Attempt to read a stream into a buffer. If truncated + is NULL, behaves as fz_read_all; otherwise sets the truncated + flag in case of error. + + stm: The stream to read from. + + initial: Suggested initial size for the buffer. + + truncated: Flag to store success/failure indication in. + + worst_case: 0 for unknown, otherwise an upper bound for the + size of the stream. + + Returns a buffer created from reading from the stream.
+*/ +fz_buffer *fz_read_best(fz_context *ctx, fz_stream *stm, size_t initial, int *truncated, size_t worst_case); + +/** + Read a line from stream into the buffer until either a + terminating newline or EOF, which it replaces with a null byte + ('\0'). + + Returns buf on success, and NULL when end of file occurs while + no characters have been read. +*/ +char *fz_read_line(fz_context *ctx, fz_stream *stm, char *buf, size_t max); + +/** + Skip over a given string in a stream. Return 0 if successfully + skipped, non-zero otherwise. As many characters will be skipped + over as matched in the string. +*/ +int fz_skip_string(fz_context *ctx, fz_stream *stm, const char *str); + +/** + Skip over whitespace (bytes <= 32) in a stream. +*/ +void fz_skip_space(fz_context *ctx, fz_stream *stm); + +/** + Ask how many bytes are available immediately from + a given stream. + + stm: The stream to read from. + + max: A hint for the underlying stream; the maximum number of + bytes that we are sure we will want to read. If you do not know + this number, give 1. + + Returns the number of bytes immediately available between the + read and write pointers. This number is guaranteed only to be 0 + if we have hit EOF. The number of bytes returned here need have + no relation to max (could be larger, could be smaller). +*/ +static inline size_t fz_available(fz_context *ctx, fz_stream *stm, size_t max) +{ + size_t len = stm->wp - stm->rp; + int c = EOF; + + if (len) + return len; + if (stm->eof) + return 0; + + fz_try(ctx) + c = stm->next(ctx, stm, max); + fz_catch(ctx) + { + fz_rethrow_if(ctx, FZ_ERROR_TRYLATER); + fz_warn(ctx, "read error; treating as end of file"); + stm->error = 1; + c = EOF; + } + if (c == EOF) + { + stm->eof = 1; + return 0; + } + stm->rp--; + return stm->wp - stm->rp; +} + +/** + Read the next byte from a stream. + + stm: The stream to read from. + + Returns -1 for end of stream, or the next byte. May + throw exceptions.
+*/ +static inline int fz_read_byte(fz_context *ctx, fz_stream *stm) +{ + int c = EOF; + + if (stm->rp != stm->wp) + return *stm->rp++; + if (stm->eof) + return EOF; + fz_try(ctx) + c = stm->next(ctx, stm, 1); + fz_catch(ctx) + { + fz_rethrow_if(ctx, FZ_ERROR_TRYLATER); + fz_warn(ctx, "read error; treating as end of file"); + stm->error = 1; + c = EOF; + } + if (c == EOF) + stm->eof = 1; + return c; +} + +/** + Peek at the next byte in a stream. + + stm: The stream to peek at. + + Returns -1 for EOF, or the next byte that will be read. +*/ +static inline int fz_peek_byte(fz_context *ctx, fz_stream *stm) +{ + int c = EOF; + + if (stm->rp != stm->wp) + return *stm->rp; + if (stm->eof) + return EOF; + + fz_try(ctx) + { + c = stm->next(ctx, stm, 1); + if (c != EOF) + stm->rp--; + } + fz_catch(ctx) + { + fz_rethrow_if(ctx, FZ_ERROR_TRYLATER); + fz_warn(ctx, "read error; treating as end of file"); + stm->error = 1; + c = EOF; + } + if (c == EOF) + stm->eof = 1; + return c; +} + +/** + Unread the single last byte successfully + read from a stream. Do not call this without having + successfully read a byte. + + stm: The stream to operate upon. +*/ +static inline void fz_unread_byte(fz_context *ctx FZ_UNUSED, fz_stream *stm) +{ + stm->rp--; +} + +/** + Query if the stream has reached EOF (during normal bytewise + reading). + + See fz_is_eof_bits for the equivalent function for bitwise + reading. +*/ +static inline int fz_is_eof(fz_context *ctx, fz_stream *stm) +{ + if (stm->rp == stm->wp) + { + if (stm->eof) + return 1; + return fz_peek_byte(ctx, stm) == EOF; + } + return 0; +} + +/** + Read the next n bits from a stream (assumed to + be packed most significant bit first). + + stm: The stream to read from. + + n: The number of bits to read, between 1 and 8*sizeof(int) + inclusive. + + Returns -1 for EOF, or the required number of bits. 
+*/ +static inline unsigned int fz_read_bits(fz_context *ctx, fz_stream *stm, int n) +{ + int x; + + if (n <= stm->avail) + { + stm->avail -= n; + x = (stm->bits >> stm->avail) & ((1 << n) - 1); + } + else + { + x = stm->bits & ((1 << stm->avail) - 1); + n -= stm->avail; + stm->avail = 0; + + while (n > 8) + { + x = (x << 8) | fz_read_byte(ctx, stm); + n -= 8; + } + + if (n > 0) + { + stm->bits = fz_read_byte(ctx, stm); + stm->avail = 8 - n; + x = (x << n) | (stm->bits >> stm->avail); + } + } + + return x; +} + +/** + Read the next n bits from a stream (assumed to + be packed least significant bit first). + + stm: The stream to read from. + + n: The number of bits to read, between 1 and 8*sizeof(int) + inclusive. + + Returns (unsigned int)-1 for EOF, or the required number of bits. +*/ +static inline unsigned int fz_read_rbits(fz_context *ctx, fz_stream *stm, int n) +{ + int x; + + if (n <= stm->avail) + { + x = stm->bits & ((1 << n) - 1); + stm->avail -= n; + stm->bits = stm->bits >> n; + } + else + { + unsigned int used = 0; + + x = stm->bits & ((1 << stm->avail) - 1); + n -= stm->avail; + used = stm->avail; + stm->avail = 0; + + while (n > 8) + { + x = (fz_read_byte(ctx, stm) << used) | x; + n -= 8; + used += 8; + } + + if (n > 0) + { + stm->bits = fz_read_byte(ctx, stm); + x = ((stm->bits & ((1 << n) - 1)) << used) | x; + stm->avail = 8 - n; + stm->bits = stm->bits >> n; + } + } + + return x; +} + +/** + Called after reading bits to tell the stream + that we are about to return to reading bytewise. Resyncs + the stream to whole byte boundaries. +*/ +static inline void fz_sync_bits(fz_context *ctx FZ_UNUSED, fz_stream *stm) +{ + stm->avail = 0; +} + +/** + Query if the stream has reached EOF (during bitwise + reading). + + See fz_is_eof for the equivalent function for bytewise + reading. 
+*/ +static inline int fz_is_eof_bits(fz_context *ctx, fz_stream *stm) +{ + return fz_is_eof(ctx, stm) && (stm->avail == 0 || stm->bits == EOF); +} + +/* Implementation details: subject to change. */ + +/** + Create a stream from a FILE * that will not be closed + when the stream is dropped. +*/ +fz_stream *fz_open_file_ptr_no_close(fz_context *ctx, FILE *file); + +#endif diff --git a/include/mupdf/fitz/string-util.h b/include/mupdf/fitz/string-util.h new file mode 100644 index 0000000..d2bba64 --- /dev/null +++ b/include/mupdf/fitz/string-util.h @@ -0,0 +1,267 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_STRING_H +#define MUPDF_FITZ_STRING_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" + +/* The Unicode character used to replace an incoming character whose + * value is unknown or unrepresentable. */ +#define FZ_REPLACEMENT_CHARACTER 0xFFFD + +/** + Safe string functions +*/ + +/** + Return strlen(s), if that is less than maxlen, or maxlen if + there is no null byte ('\0') among the first maxlen bytes.
+*/ +size_t fz_strnlen(const char *s, size_t maxlen); + +/** + Given a pointer to a C string (or a pointer to NULL) break + it at the first occurrence of a delimiter char (from a given + set). + + stringp: Pointer to a C string pointer (or NULL). Updated on + exit to point to the first char of the string after the + delimiter that was found. The string pointed to by stringp will + be corrupted by this call (as the found delimiter will be + overwritten by 0). + + delim: A C string of acceptable delimiter characters. + + Returns a pointer to a C string containing the chars of stringp + up to the first delimiter char (or the end of the string), or + NULL. +*/ +char *fz_strsep(char **stringp, const char *delim); + +/** + Copy at most n-1 chars of a string into a destination + buffer with null termination, returning the real length of the + initial string (excluding terminator). + + dst: Destination buffer, at least n bytes long. + + src: C string (non-NULL). + + n: Size of dst buffer in bytes. + + Returns the length (excluding terminator) of src. +*/ +size_t fz_strlcpy(char *dst, const char *src, size_t n); + +/** + Concatenate 2 strings, with a maximum length. + + dst: pointer to first string in a buffer of n bytes. + + src: pointer to string to concatenate. + + n: Size (in bytes) of buffer that dst is in. + + Returns the real length that a concatenated dst + src would have + been (not including terminator). +*/ +size_t fz_strlcat(char *dst, const char *src, size_t n); + +/** + Find the start of the first occurrence of the substring needle in haystack. +*/ +void *fz_memmem(const void *haystack, size_t haystacklen, const void *needle, size_t needlelen); + +/** + extract the directory component from a path. +*/ +void fz_dirname(char *dir, const char *path, size_t dirsize); + +/** + Find the filename component in a path. +*/ +const char *fz_basename(const char *path); + +/** + Like fz_decode_uri_component but in-place. 
+*/ +char *fz_urldecode(char *url); + +/** + * Return a new string representing the unencoded version of the given URI. + * This decodes all escape sequences except those that would result in reserved + * characters that are part of the URI syntax (; / ? : @ & = + $ , #). + */ +char *fz_decode_uri(fz_context *ctx, const char *s); + +/** + * Return a new string representing the unencoded version of the given URI component. + * This decodes all escape sequences! + */ +char *fz_decode_uri_component(fz_context *ctx, const char *s); + +/** + * Return a new string representing the provided string encoded as a URI. + */ +char *fz_encode_uri(fz_context *ctx, const char *s); + +/** + * Return a new string representing the provided string encoded as a URI component. + * This also encodes the special reserved characters (; / ? : @ & = + $ , #). + */ +char *fz_encode_uri_component(fz_context *ctx, const char *s); + +/** + * Return a new string representing the provided string encoded as a URI path name. + * This also encodes the special reserved characters except /. + */ +char *fz_encode_uri_pathname(fz_context *ctx, const char *s); + +/** + Create output file name using a template. + + If the path contains %[0-9]*d, the first such pattern will be + replaced with the page number. If the template does not contain + such a pattern, the page number will be inserted before the + filename extension. If the template does not have a filename + extension, the page number will be added to the end. +*/ +void fz_format_output_path(fz_context *ctx, char *path, size_t size, const char *fmt, int page); + +/** + Rewrite path to the shortest string that names the same path. + + Eliminates multiple and trailing slashes, interprets "." and + "..". Overwrites the string in place. +*/ +char *fz_cleanname(char *name); + +/** + Resolve a path to an absolute file name. + The resolved path buffer must be of at least PATH_MAX size.
+*/ +char *fz_realpath(const char *path, char *resolved_path); + +/** + Case insensitive (ASCII only) string comparison. +*/ +int fz_strcasecmp(const char *a, const char *b); +int fz_strncasecmp(const char *a, const char *b, size_t n); + +/** + FZ_UTFMAX: Maximum number of bytes in a decoded rune (maximum + length returned by fz_chartorune). +*/ +enum { FZ_UTFMAX = 4 }; + +/** + UTF8 decode a single rune from a sequence of chars. + + rune: Pointer to an int to assign the decoded 'rune' to. + + str: Pointer to a UTF8 encoded string. + + Returns the number of bytes consumed. +*/ +int fz_chartorune(int *rune, const char *str); + +/** + UTF8 encode a rune to a sequence of chars. + + str: Pointer to a place to put the UTF8 encoded character. + + rune: The rune to encode. + + Returns the number of bytes the rune took to output. +*/ +int fz_runetochar(char *str, int rune); + +/** + Count how many chars are required to represent a rune. + + rune: The rune to encode. + + Returns the number of bytes required to represent this rune in + UTF8. +*/ +int fz_runelen(int rune); + +/** + Compute the index of a rune in a string. + + str: Pointer to beginning of a string. + + p: Pointer to a char in str. + + Returns the index of the rune pointed to by p in str. +*/ +int fz_runeidx(const char *str, const char *p); + +/** + Obtain a pointer to the char representing the rune + at a given index. + + str: Pointer to beginning of a string. + + idx: Index of a rune to return a char pointer to. + + Returns a pointer to the char where the desired rune starts, + or NULL if the string ends before the index is reached. +*/ +const char *fz_runeptr(const char *str, int idx); + +/** + Count how many runes the UTF-8 encoded string + consists of. + + s: The UTF-8 encoded, NUL-terminated text string. + + Returns the number of runes in the string. +*/ +int fz_utflen(const char *s); + +/** + Locale-independent decimal to binary conversion. On overflow + return (-)INFINITY and set errno to ERANGE.
On underflow return + 0 and set errno to ERANGE. Special inputs (case insensitive): + "NAN", "INF" or "INFINITY". +*/ +float fz_strtof(const char *s, char **es); + +int fz_grisu(float f, char *s, int *exp); + +/** + Check and parse string into page ranges: + /,?(-?\d+|N)(-(-?\d+|N))?/ +*/ +int fz_is_page_range(fz_context *ctx, const char *s); +const char *fz_parse_page_range(fz_context *ctx, const char *s, int *a, int *b, int n); + +/** + Unicode aware tolower and toupper functions. +*/ +int fz_tolower(int c); +int fz_toupper(int c); + +#endif diff --git a/include/mupdf/fitz/structured-text.h b/include/mupdf/fitz/structured-text.h new file mode 100644 index 0000000..b259517 --- /dev/null +++ b/include/mupdf/fitz/structured-text.h @@ -0,0 +1,363 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_STRUCTURED_TEXT_H +#define MUPDF_FITZ_STRUCTURED_TEXT_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/types.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/font.h" +#include "mupdf/fitz/image.h" +#include "mupdf/fitz/output.h" +#include "mupdf/fitz/device.h" +#include "mupdf/fitz/pool.h" + +/** + Simple text layout (for use with annotation editing primarily). +*/ +typedef struct fz_layout_char +{ + float x, advance; + const char *p; /* location in source text of character */ + struct fz_layout_char *next; +} fz_layout_char; + +typedef struct fz_layout_line +{ + float x, y, font_size; + const char *p; /* location in source text of start of line */ + fz_layout_char *text; + struct fz_layout_line *next; +} fz_layout_line; + +typedef struct +{ + fz_pool *pool; + fz_matrix matrix; + fz_matrix inv_matrix; + fz_layout_line *head, **tailp; + fz_layout_char **text_tailp; +} fz_layout_block; + +/** + Create a new layout block, with new allocation pool, zero + matrices, and initialise linked pointers. +*/ +fz_layout_block *fz_new_layout(fz_context *ctx); + +/** + Drop layout block. Free the pool, and linked blocks. + + Never throws exceptions. +*/ +void fz_drop_layout(fz_context *ctx, fz_layout_block *block); + +/** + Add a new line to the end of the layout block. +*/ +void fz_add_layout_line(fz_context *ctx, fz_layout_block *block, float x, float y, float h, const char *p); + +/** + Add a new char to the line at the end of the layout block. +*/ +void fz_add_layout_char(fz_context *ctx, fz_layout_block *block, float x, float w, const char *p); + +/** + Text extraction device: Used for searching, format conversion etc. 
+ + (In development - Subject to change in future versions) +*/ + +typedef struct fz_stext_char fz_stext_char; +typedef struct fz_stext_line fz_stext_line; +typedef struct fz_stext_block fz_stext_block; + +/** + FZ_STEXT_PRESERVE_LIGATURES: If this option is activated + ligatures are passed through to the application in their + original form. If this option is deactivated ligatures are + expanded into their constituent parts, e.g. the ligature ffi is + expanded into three separate characters f, f and i. + + FZ_STEXT_PRESERVE_WHITESPACE: If this option is activated + whitespace is passed through to the application in its original + form. If this option is deactivated any type of horizontal + whitespace (including horizontal tabs) will be replaced with + space characters of variable width. + + FZ_STEXT_PRESERVE_IMAGES: If this option is set, then images + will be stored in the structured text structure. The default is + to ignore all images. + + FZ_STEXT_INHIBIT_SPACES: If this option is set, we will not try + to add missing space characters where there are large gaps + between characters. + + FZ_STEXT_DEHYPHENATE: If this option is set, hyphens at the + end of a line will be removed and the lines will be merged. + + FZ_STEXT_PRESERVE_SPANS: If this option is set, spans on the same line + will not be merged. Each line will thus be a span of text with the same + font, colour, and size. + + FZ_STEXT_MEDIABOX_CLIP: If this option is set, characters entirely + outside each page's mediabox will be ignored. +*/ +enum +{ + FZ_STEXT_PRESERVE_LIGATURES = 1, + FZ_STEXT_PRESERVE_WHITESPACE = 2, + FZ_STEXT_PRESERVE_IMAGES = 4, + FZ_STEXT_INHIBIT_SPACES = 8, + FZ_STEXT_DEHYPHENATE = 16, + FZ_STEXT_PRESERVE_SPANS = 32, + FZ_STEXT_MEDIABOX_CLIP = 64, +}; + +/** + A text page is a list of blocks, together with an overall + bounding box. 
+*/ +typedef struct +{ + fz_pool *pool; + fz_rect mediabox; + fz_stext_block *first_block, *last_block; +} fz_stext_page; + +enum +{ + FZ_STEXT_BLOCK_TEXT = 0, + FZ_STEXT_BLOCK_IMAGE = 1 +}; + +/** + A text block is a list of lines of text (typically a paragraph), + or an image. +*/ +struct fz_stext_block +{ + int type; + fz_rect bbox; + union { + struct { fz_stext_line *first_line, *last_line; } t; + struct { fz_matrix transform; fz_image *image; } i; + } u; + fz_stext_block *prev, *next; +}; + +/** + A text line is a list of characters that share a common baseline. +*/ +struct fz_stext_line +{ + int wmode; /* 0 for horizontal, 1 for vertical */ + fz_point dir; /* normalized direction of baseline */ + fz_rect bbox; + fz_stext_char *first_char, *last_char; + fz_stext_line *prev, *next; +}; + +/** + A text char is a unicode character, the style in which it + appears, and the point at which it is positioned. +*/ +struct fz_stext_char +{ + int c; + int color; /* sRGB hex color */ + fz_point origin; + fz_quad quad; + float size; + fz_font *font; + fz_stext_char *next; +}; + +FZ_DATA extern const char *fz_stext_options_usage; + +/** + Create an empty text page. + + The text page is filled out by the text device to contain the + blocks and lines of text on the page. + + mediabox: optional mediabox information. +*/ +fz_stext_page *fz_new_stext_page(fz_context *ctx, fz_rect mediabox); +void fz_drop_stext_page(fz_context *ctx, fz_stext_page *page); + +/** + Output structured text to a file in HTML (visual) format. +*/ +void fz_print_stext_page_as_html(fz_context *ctx, fz_output *out, fz_stext_page *page, int id); +void fz_print_stext_header_as_html(fz_context *ctx, fz_output *out); +void fz_print_stext_trailer_as_html(fz_context *ctx, fz_output *out); + +/** + Output structured text to a file in XHTML (semantic) format.
+*/ +void fz_print_stext_page_as_xhtml(fz_context *ctx, fz_output *out, fz_stext_page *page, int id); +void fz_print_stext_header_as_xhtml(fz_context *ctx, fz_output *out); +void fz_print_stext_trailer_as_xhtml(fz_context *ctx, fz_output *out); + +/** + Output structured text to a file in XML format. +*/ +void fz_print_stext_page_as_xml(fz_context *ctx, fz_output *out, fz_stext_page *page, int id); + +/** + Output structured text to a file in JSON format. +*/ +void fz_print_stext_page_as_json(fz_context *ctx, fz_output *out, fz_stext_page *page, float scale); + +/** + Output structured text to a file in plain-text UTF-8 format. +*/ +void fz_print_stext_page_as_text(fz_context *ctx, fz_output *out, fz_stext_page *page); + +/** + Search for occurrence of 'needle' in text page. + + Return the number of hits and store hit quads in the passed in + array. + + NOTE: This is an experimental interface and subject to change + without notice. +*/ +int fz_search_stext_page(fz_context *ctx, fz_stext_page *text, const char *needle, int *hit_mark, fz_quad *hit_bbox, int hit_max); + +/** + Return a list of quads to highlight lines inside the selection + points. +*/ +int fz_highlight_selection(fz_context *ctx, fz_stext_page *page, fz_point a, fz_point b, fz_quad *quads, int max_quads); + +enum +{ + FZ_SELECT_CHARS, + FZ_SELECT_WORDS, + FZ_SELECT_LINES, +}; + +fz_quad fz_snap_selection(fz_context *ctx, fz_stext_page *page, fz_point *ap, fz_point *bp, int mode); + +/** + Return a newly allocated UTF-8 string with the text for a given + selection. + + crlf: If true, write "\r\n" style line endings (otherwise "\n" + only). +*/ +char *fz_copy_selection(fz_context *ctx, fz_stext_page *page, fz_point a, fz_point b, int crlf); + +/** + Return a newly allocated UTF-8 string with the text for a given + selection rectangle. + + crlf: If true, write "\r\n" style line endings (otherwise "\n" + only). 
+*/ +char *fz_copy_rectangle(fz_context *ctx, fz_stext_page *page, fz_rect area, int crlf); + +/** + Options for creating structured text. +*/ +typedef struct +{ + int flags; + float scale; +} fz_stext_options; + +/** + Parse stext device options from a comma separated key-value + string. +*/ +fz_stext_options *fz_parse_stext_options(fz_context *ctx, fz_stext_options *opts, const char *string); + +/** + Create a device to extract the text on a page. + + Gather the text on a page into blocks and lines. + + The reading order is taken from the order the text is drawn in + the source file, so may not be accurate. + + page: The text page to which content should be added. This will + usually be a newly created (empty) text page, but it can be one + containing data already (for example when merging multiple + pages, or watermarking). + + options: Options to configure the stext device. +*/ +fz_device *fz_new_stext_device(fz_context *ctx, fz_stext_page *page, const fz_stext_options *options); + +/** + Create a device to OCR the text on the page. + + Renders the page internally to a bitmap that is then OCRd. Text + is then forwarded onto the target device. + + target: The target device to receive the OCRd text. + + ctm: The transform to apply to the mediabox to get the size for + the rendered page image. Also used to calculate the resolution + for the page image. In general, this will be the same as the CTM + that you pass to fz_run_page (or fz_run_display_list) to feed + this device. + + mediabox: The mediabox (in points). Combined with the CTM to get + the bounds of the pixmap used internally for the rendered page + image. + + with_list: If with_list is false, then all non-text operations + are forwarded instantly to the target device. This results in + the target device seeing all NON-text operations, followed by + all the text operations (derived from OCR).
+ + If with_list is true, then all the marking operations are + collated into a display list which is then replayed to the + target device at the end. + + language: NULL (for "eng"), or a pointer to a string to describe + the languages/scripts that should be used for OCR (e.g. + "eng,ara"). + + datadir: NULL (for ""), or a pointer to a path string otherwise + provided to Tesseract in the TESSDATA_PREFIX environment variable. + + progress: NULL, or function to be called periodically to indicate + progress. Return 0 to continue, or 1 to cancel. progress_arg is + passed back as the void * argument. The int is a value between + 0 and 100 to indicate progress. + + progress_arg: A void * value to be passed back to the progress + function. +*/ +fz_device *fz_new_ocr_device(fz_context *ctx, fz_device *target, fz_matrix ctm, fz_rect mediabox, int with_list, const char *language, + const char *datadir, int (*progress)(fz_context *, void *, int), void *progress_arg); + +fz_document *fz_open_reflowed_document(fz_context *ctx, fz_document *underdoc, const fz_stext_options *opts); + + +#endif diff --git a/include/mupdf/fitz/system.h b/include/mupdf/fitz/system.h new file mode 100644 index 0000000..7f8a564 --- /dev/null +++ b/include/mupdf/fitz/system.h @@ -0,0 +1,460 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF.
If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_SYSTEM_H +#define MUPDF_FITZ_SYSTEM_H + +/* Turn on valgrind pacification in debug builds. */ +#ifndef NDEBUG +#ifndef PACIFY_VALGRIND +#define PACIFY_VALGRIND +#endif +#endif + +/** + Include the standard libc headers. +*/ + +#include <stddef.h> /* needed for size_t */ +#include <stdarg.h> /* needed for va_list vararg functions */ +#include <setjmp.h> /* needed for the try/catch macros */ +#include <stdio.h> /* useful for debug printfs */ + +#include "export.h" + +#if defined(_MSC_VER) && (_MSC_VER < 1700) /* MSVC older than VS2012 */ +typedef signed char int8_t; +typedef short int int16_t; +typedef int int32_t; +typedef __int64 int64_t; +typedef unsigned char uint8_t; +typedef unsigned short int uint16_t; +typedef unsigned int uint32_t; +typedef unsigned __int64 uint64_t; +#ifndef INT64_MAX +#define INT64_MAX 9223372036854775807i64 +#endif +#else +#include <stdint.h> /* needed for int64_t */ +#endif + +#include "mupdf/memento.h" +#include "mupdf/fitz/track-usage.h" + +#define nelem(x) (sizeof(x)/sizeof((x)[0])) + +#define FZ_PI 3.14159265f +#define FZ_RADIAN 57.2957795f +#define FZ_DEGREE 0.017453292f +#define FZ_SQRT2 1.41421356f +#define FZ_LN2 0.69314718f + +/** + Spot architectures where we have optimisations. +*/ + +#if defined(__arm__) || defined(__thumb__) +#ifndef ARCH_ARM +#define ARCH_ARM +#endif +#endif + +/** + Some differences in libc can be smoothed over +*/ + +#ifndef __STRICT_ANSI__ +#if defined(__APPLE__) +#ifndef HAVE_SIGSETJMP +#define HAVE_SIGSETJMP 1 +#endif +#elif defined(__unix) +#ifndef __EMSCRIPTEN__ +#ifndef HAVE_SIGSETJMP +#define HAVE_SIGSETJMP 1 +#endif +#endif +#endif +#endif +#ifndef HAVE_SIGSETJMP +#define HAVE_SIGSETJMP 0 +#endif + +/** + Where possible (i.e.
on platforms on which they are provided), + use sigsetjmp/siglongjmp in preference to setjmp/longjmp. We + don't alter signal handlers within mupdf, so there is no need + for us to store/restore them - hence we use the non-restoring + variants. This makes a large speed difference on MacOSX (and + probably other platforms too). +*/ +#if HAVE_SIGSETJMP +#define fz_setjmp(BUF) sigsetjmp(BUF, 0) +#define fz_longjmp(BUF,VAL) siglongjmp(BUF, VAL) +typedef sigjmp_buf fz_jmp_buf; +#else +#define fz_setjmp(BUF) setjmp(BUF) +#define fz_longjmp(BUF,VAL) longjmp(BUF,VAL) +typedef jmp_buf fz_jmp_buf; +#endif + +/* these constants mirror the corresponding macros in stdio.h */ +#ifndef EOF +#define EOF (-1) +#endif +#ifndef SEEK_SET +#define SEEK_SET 0 +#endif +#ifndef SEEK_CUR +#define SEEK_CUR 1 +#endif +#ifndef SEEK_END +#define SEEK_END 2 +#endif + +#ifdef _MSC_VER /* Microsoft Visual C */ + +/* MSVC up to VS2012 */ +#if _MSC_VER < 1800 +static __inline int signbit(double x) +{ + union + { + double d; + __int64 i; + } u; + u.d = x; + return (int)(u.i>>63); +} +#endif + +#pragma warning( disable: 4244 ) /* conversion from X to Y, possible loss of data */ +#pragma warning( disable: 4701 ) /* Potentially uninitialized local variable 'name' used */ +#pragma warning( disable: 4996 ) /* 'function': was declared deprecated */ + +#if _MSC_VER <= 1700 /* MSVC 2012 */ +#define isnan(x) _isnan(x) +#define isinf(x) (!_finite(x)) +#endif + +#if _MSC_VER <= 1920 /* MSVC 2019 */ +#define hypotf _hypotf +#endif +#define atoll _atoi64 + +#endif + +#ifdef _WIN32 + +char *fz_utf8_from_wchar(const wchar_t *s); +wchar_t *fz_wchar_from_utf8(const char *s); + +/* really a FILE* but we don't want to include stdio.h here */ +void *fz_fopen_utf8(const char *name, const char *mode); +int fz_remove_utf8(const char *name); + +char **fz_argv_from_wargv(int argc, wchar_t **wargv); +void fz_free_argv(int argc, char **argv); + +#endif + +/* Cope with systems (such as Windows) with no S_ISDIR */ +#ifndef 
S_ISDIR +#define S_ISDIR(mode) ((mode) & S_IFDIR) +#endif + +int64_t fz_stat_ctime(const char *path); +int64_t fz_stat_mtime(const char *path); + +/* inline is standard in C++. For some compilers we can enable it within + * C too. Some compilers think they know better than we do about when + * to actually honour inline (particularly for large functions); use + * fz_forceinline to kick them into really inlining. */ + +#ifndef __cplusplus +#if defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */ +#elif defined(_MSC_VER) && (_MSC_VER >= 1500) /* MSVC 9 or newer */ +#define inline __inline +#define fz_forceinline __forceinline +#elif defined(__GNUC__) && (__GNUC__ >= 3) /* GCC 3 or newer */ +#define inline __inline +#else /* Unknown or ancient */ +#define inline +#endif +#endif + +#ifndef fz_forceinline +#define fz_forceinline inline +#endif + +/* restrict is standard in C99, but not in all C++ compilers. */ +#if defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */ +#define FZ_RESTRICT restrict +#elif defined(_MSC_VER) && (_MSC_VER >= 1600) /* MSVC 10 or newer */ +#define FZ_RESTRICT __restrict +#elif defined(__GNUC__) && (__GNUC__ >= 3) /* GCC 3 or newer */ +#define FZ_RESTRICT __restrict +#else /* Unknown or ancient */ +#define FZ_RESTRICT +#endif + +/* noreturn is a GCC extension */ +#ifdef __GNUC__ +#define FZ_NORETURN __attribute__((noreturn)) +#else +#ifdef _MSC_VER +#define FZ_NORETURN __declspec(noreturn) +#else +#define FZ_NORETURN +#endif +#endif + +/* Flag unused parameters, for use with 'static inline' functions in + * headers. 
*/ +#if defined(__GNUC__) && (__GNUC__ > 2 || __GNUC__ == 2 && __GNUC_MINOR__ >= 7) +#define FZ_UNUSED __attribute__((__unused__)) +#else +#define FZ_UNUSED +#endif + +/* GCC can do type checking of printf strings */ +#ifdef __printflike +#define FZ_PRINTFLIKE(F,V) __printflike(F,V) +#else +#if defined(__GNUC__) && (__GNUC__ > 2 || __GNUC__ == 2 && __GNUC_MINOR__ >= 7) +#define FZ_PRINTFLIKE(F,V) __attribute__((__format__ (__printf__, F, V))) +#else +#define FZ_PRINTFLIKE(F,V) +#endif +#endif + +/* ARM assembly specific defines */ + +#ifdef ARCH_ARM + +/* If we're compiling as thumb code, then we need to tell the compiler + * to enter and exit ARM mode around our assembly sections. If we move + * the ARM functions to a separate file and arrange for it to be + * compiled without thumb mode, we can save some time on entry. + */ +/* This is slightly suboptimal; __thumb__ and __thumb2__ become defined + * and undefined by #pragma arm/#pragma thumb - but we can't define a + * macro to track that. */ +#if defined(__thumb__) || defined(__thumb2__) +#define ENTER_ARM ".balign 4\nmov r12,pc\nbx r12\n0:.arm\n" +#define ENTER_THUMB "9:.thumb\n" +#else +#define ENTER_ARM +#define ENTER_THUMB +#endif + +#endif + +/* Memory block alignment */ + +/* Most architectures are happy with blocks being aligned to the size + * of void *'s. Some (notably sparc) are not. + * + * Some architectures (notably amd64) are happy for pointers to be 32bit + * aligned even on 64bit systems. By making use of this we can save lots + * of memory in data structures (notably the display list). + * + * We attempt to cope with these vagaries via the following definitions. + */ + +/* All blocks allocated by mupdf's allocators are expected to be + * returned aligned to FZ_MEMORY_BLOCK_ALIGN_MOD. This is sizeof(void *) + * unless overwritten by a predefinition, or by a specific architecture + * being detected. 
*/ +#ifndef FZ_MEMORY_BLOCK_ALIGN_MOD +#if defined(sparc) || defined(__sparc) || defined(__sparc__) +#define FZ_MEMORY_BLOCK_ALIGN_MOD 8 +#else +#define FZ_MEMORY_BLOCK_ALIGN_MOD sizeof(void *) +#endif +#endif + +/* MuPDF will ensure that its use of pointers in packed structures + * (such as the display list) will be aligned to FZ_POINTER_ALIGN_MOD. + * This is the same as FZ_MEMORY_BLOCK_ALIGN_MOD unless overridden by + * a predefinition, or by a specific architecture being detected. */ +#ifndef FZ_POINTER_ALIGN_MOD +#if defined(__amd64) || defined(__amd64__) || defined(__x86_64) || defined(__x86_64__) +#define FZ_POINTER_ALIGN_MOD 4 +#else +#define FZ_POINTER_ALIGN_MOD FZ_MEMORY_BLOCK_ALIGN_MOD +#endif +#endif + +#ifdef CLUSTER +/* Include this first so our defines don't clash with the system + * definitions */ +#include <math.h> +/** + * Trig functions + */ +static float +my_atan_table[258] = +{ +0.0000000000f, 0.00390623013f,0.00781234106f,0.0117182136f, +0.0156237286f, 0.0195287670f, 0.0234332099f, 0.0273369383f, +0.0312398334f, 0.0351417768f, 0.0390426500f, 0.0429423347f, +0.0468407129f, 0.0507376669f, 0.0546330792f, 0.0585268326f, +0.0624188100f, 0.0663088949f, 0.0701969711f, 0.0740829225f, +0.0779666338f, 0.0818479898f, 0.0857268758f, 0.0896031775f, +0.0934767812f, 0.0973475735f, 0.1012154420f, 0.1050802730f, +0.1089419570f, 0.1128003810f, 0.1166554350f, 0.1205070100f, +0.1243549950f, 0.1281992810f, 0.1320397620f, 0.1358763280f, +0.1397088740f, 0.1435372940f, 0.1473614810f, 0.1511813320f, +0.1549967420f, 0.1588076080f, 0.1626138290f, 0.1664153010f, +0.1702119250f, 0.1740036010f, 0.1777902290f, 0.1815717110f, +0.1853479500f, 0.1891188490f, 0.1928843120f, 0.1966442450f, +0.2003985540f, 0.2041471450f, 0.2078899270f, 0.2116268090f, +0.2153577000f, 0.2190825110f, 0.2228011540f, 0.2265135410f, +0.2302195870f, 0.2339192060f, 0.2376123140f, 0.2412988270f, +0.2449786630f, 0.2486517410f, 0.2523179810f, 0.2559773030f, +0.2596296290f, 0.2632748830f, 0.2669129880f, 
0.2705438680f, +0.2741674510f, 0.2777836630f, 0.2813924330f, 0.2849936890f, +0.2885873620f, 0.2921733830f, 0.2957516860f, 0.2993222020f, +0.3028848680f, 0.3064396190f, 0.3099863910f, 0.3135251230f, +0.3170557530f, 0.3205782220f, 0.3240924700f, 0.3275984410f, +0.3310960770f, 0.3345853220f, 0.3380661230f, 0.3415384250f, +0.3450021770f, 0.3484573270f, 0.3519038250f, 0.3553416220f, +0.3587706700f, 0.3621909220f, 0.3656023320f, 0.3690048540f, +0.3723984470f, 0.3757830650f, 0.3791586690f, 0.3825252170f, +0.3858826690f, 0.3892309880f, 0.3925701350f, 0.3959000740f, +0.3992207700f, 0.4025321870f, 0.4058342930f, 0.4091270550f, +0.4124104420f, 0.4156844220f, 0.4189489670f, 0.4222040480f, +0.4254496370f, 0.4286857080f, 0.4319122350f, 0.4351291940f, +0.4383365600f, 0.4415343100f, 0.4447224240f, 0.4479008790f, +0.4510696560f, 0.4542287350f, 0.4573780990f, 0.4605177290f, +0.4636476090f, 0.4667677240f, 0.4698780580f, 0.4729785980f, +0.4760693300f, 0.4791502430f, 0.4822213240f, 0.4852825630f, +0.4883339510f, 0.4913754780f, 0.4944071350f, 0.4974289160f, +0.5004408130f, 0.5034428210f, 0.5064349340f, 0.5094171490f, +0.5123894600f, 0.5153518660f, 0.5183043630f, 0.5212469510f, +0.5241796290f, 0.5271023950f, 0.5300152510f, 0.5329181980f, +0.5358112380f, 0.5386943730f, 0.5415676050f, 0.5444309400f, +0.5472843810f, 0.5501279330f, 0.5529616020f, 0.5557853940f, +0.5585993150f, 0.5614033740f, 0.5641975770f, 0.5669819340f, +0.5697564530f, 0.5725211450f, 0.5752760180f, 0.5780210840f, +0.5807563530f, 0.5834818390f, 0.5861975510f, 0.5889035040f, +0.5915997100f, 0.5942861830f, 0.5969629370f, 0.5996299860f, +0.6022873460f, 0.6049350310f, 0.6075730580f, 0.6102014430f, +0.6128202020f, 0.6154293530f, 0.6180289120f, 0.6206188990f, +0.6231993300f, 0.6257702250f, 0.6283316020f, 0.6308834820f, +0.6334258830f, 0.6359588250f, 0.6384823300f, 0.6409964180f, +0.6435011090f, 0.6459964250f, 0.6484823880f, 0.6509590190f, +0.6534263410f, 0.6558843770f, 0.6583331480f, 0.6607726790f, +0.6632029930f, 0.6656241120f, 
0.6680360620f, 0.6704388650f, +0.6728325470f, 0.6752171330f, 0.6775926450f, 0.6799591110f, +0.6823165550f, 0.6846650020f, 0.6870044780f, 0.6893350100f, +0.6916566220f, 0.6939693410f, 0.6962731940f, 0.6985682070f, +0.7008544080f, 0.7031318220f, 0.7054004770f, 0.7076604000f, +0.7099116190f, 0.7121541600f, 0.7143880520f, 0.7166133230f, +0.7188300000f, 0.7210381110f, 0.7232376840f, 0.7254287490f, +0.7276113330f, 0.7297854640f, 0.7319511710f, 0.7341084830f, +0.7362574290f, 0.7383980370f, 0.7405303370f, 0.7426543560f, +0.7447701260f, 0.7468776740f, 0.7489770290f, 0.7510682220f, +0.7531512810f, 0.7552262360f, 0.7572931160f, 0.7593519510f, +0.7614027700f, 0.7634456020f, 0.7654804790f, 0.7675074280f, +0.7695264800f, 0.7715376650f, 0.7735410110f, 0.7755365500f, +0.7775243100f, 0.7795043220f, 0.7814766150f, 0.7834412190f, +0.7853981630f, 0.7853981630f /* Extended by 1 for interpolation */ +}; + +static inline float my_sinf(float x) +{ + float x2, xn; + int i; + /* Map x into the -PI to PI range. We could do this using: + * x = fmodf(x, 2.0f * FZ_PI); + * but that's C99, and seems to misbehave with negative numbers + * on some platforms. 
*/ + x -= FZ_PI; + i = x / (2.0f * FZ_PI); + x -= i * 2.0f * FZ_PI; + if (x < 0.0f) + x += 2.0f * FZ_PI; + x -= FZ_PI; + if (x <= -FZ_PI / 2.0f) + x = -FZ_PI - x; + else if (x >= FZ_PI / 2.0f) + x = FZ_PI-x; + x2 = x * x; + xn = x * x2 / 6.0f; + x -= xn; + xn *= x2 / 20.0f; + x += xn; + xn *= x2 / 42.0f; + x -= xn; + xn *= x2 / 72.0f; + x += xn; + if (x > 1) + x = 1; + else if (x < -1) + x = -1; + return x; +} + +static inline float my_atan2f(float o, float a) +{ + int negate = 0, flip = 0, i; + float r, s; + if (o == 0.0f) + { + if (a > 0) + return 0.0f; + else + return FZ_PI; + } + if (o < 0) + o = -o, negate = 1; + if (a < 0) + a = -a, flip = 1; + if (o < a) + i = 65536.0f * o / a + 0.5f; + else + i = 65536.0f * a / o + 0.5f; + r = my_atan_table[i >> 8]; + s = my_atan_table[(i >> 8) + 1]; + r += (s - r) * (i & 255) / 256.0f; + if (o >= a) + r = FZ_PI / 2.0f - r; + if (flip) + r = FZ_PI - r; + if (negate) + r = -r; + return r; +} + +#define sinf(x) my_sinf(x) +#define cosf(x) my_sinf(FZ_PI / 2.0f + (x)) +#define atan2f(x,y) my_atan2f((x),(y)) +#endif + +static inline int fz_is_pow2(int a) +{ + return (a != 0) && (a & (a-1)) == 0; +} + +#endif diff --git a/include/mupdf/fitz/text.h b/include/mupdf/fitz/text.h new file mode 100644 index 0000000..499a2ae --- /dev/null +++ b/include/mupdf/fitz/text.h @@ -0,0 +1,205 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. 
+// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_TEXT_H +#define MUPDF_FITZ_TEXT_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/font.h" +#include "mupdf/fitz/path.h" +#include "mupdf/fitz/bidi.h" + +/** + Text buffer. + + The trm field contains the a, b, c and d coefficients. + The e and f coefficients come from the individual elements, + together they form the transform matrix for the glyph. + + Glyphs are referenced by glyph ID. + The Unicode text equivalent is kept in a separate array + with indexes into the glyph array. +*/ + +typedef struct +{ + float x, y; + int gid; /* -1 for one gid to many ucs mappings */ + int ucs; /* -1 for one ucs to many gid mappings */ +} fz_text_item; + +#define FZ_LANG_TAG2(c1,c2) ((c1-'a'+1) + ((c2-'a'+1)*27)) +#define FZ_LANG_TAG3(c1,c2,c3) ((c1-'a'+1) + ((c2-'a'+1)*27) + ((c3-'a'+1)*27*27)) + +typedef enum +{ + FZ_LANG_UNSET = 0, + FZ_LANG_ur = FZ_LANG_TAG2('u','r'), + FZ_LANG_urd = FZ_LANG_TAG3('u','r','d'), + FZ_LANG_ko = FZ_LANG_TAG2('k','o'), + FZ_LANG_ja = FZ_LANG_TAG2('j','a'), + FZ_LANG_zh = FZ_LANG_TAG2('z','h'), + FZ_LANG_zh_Hans = FZ_LANG_TAG3('z','h','s'), + FZ_LANG_zh_Hant = FZ_LANG_TAG3('z','h','t'), +} fz_text_language; + +typedef struct fz_text_span +{ + fz_font *font; + fz_matrix trm; + unsigned wmode : 1; /* 0 horizontal, 1 vertical */ + unsigned bidi_level : 7; /* The bidirectional level of text */ + unsigned markup_dir : 2; /* The direction of text as marked in the original document */ + unsigned language : 15; /* The language as marked in the original document */ + int len, cap; + fz_text_item *items; + struct fz_text_span *next; +} fz_text_span; + 
+typedef struct +{ + int refs; + fz_text_span *head, *tail; +} fz_text; + +/** + Create a new empty fz_text object. + + Throws exception on failure to allocate. +*/ +fz_text *fz_new_text(fz_context *ctx); + +/** + Increment the reference count for the text object. The same + pointer is returned. + + Never throws exceptions. +*/ +fz_text *fz_keep_text(fz_context *ctx, const fz_text *text); + +/** + Decrement the reference count for the text object. When the + reference count hits zero, the text object is freed. + + Never throws exceptions. +*/ +void fz_drop_text(fz_context *ctx, const fz_text *text); + +/** + Add a glyph/unicode value to a text object. + + text: Text object to add to. + + font: The font the glyph should be added in. + + trm: The transform to use for the glyph. + + glyph: The glyph id to add. + + unicode: The unicode character for the glyph. + + wmode: 1 for vertical mode, 0 for horizontal. + + bidi_level: The bidirectional level for this glyph. + + markup_dir: The direction of the text as specified in the + markup. + + language: The language in use (if known, 0 otherwise) + (e.g. FZ_LANG_zh_Hans). + + Throws exception on failure to allocate. +*/ +void fz_show_glyph(fz_context *ctx, fz_text *text, fz_font *font, fz_matrix trm, int glyph, int unicode, int wmode, int bidi_level, fz_bidi_direction markup_dir, fz_text_language language); + +/** + Add a UTF8 string to a text object. + + text: Text object to add to. + + font: The font the string should be added in. + + trm: The transform to use. + + s: The utf-8 string to add. + + wmode: 1 for vertical mode, 0 for horizontal. + + bidi_level: The bidirectional level for this glyph. + + markup_dir: The direction of the text as specified in the markup. + + language: The language in use (if known, 0 otherwise) + (e.g. FZ_LANG_zh_Hans). + + Returns the transform updated with the advance width of the + string. 
+*/ +fz_matrix fz_show_string(fz_context *ctx, fz_text *text, fz_font *font, fz_matrix trm, const char *s, int wmode, int bidi_level, fz_bidi_direction markup_dir, fz_text_language language); + +/** + Measure the advance width of a UTF8 string should it be added to a text object. + + This uses the same layout algorithms as fz_show_string, and can be used + to calculate text alignment adjustments. +*/ +fz_matrix +fz_measure_string(fz_context *ctx, fz_font *user_font, fz_matrix trm, const char *s, int wmode, int bidi_level, fz_bidi_direction markup_dir, fz_text_language language); + +/** + Find the bounds of a given text object. + + text: The text object to find the bounds of. + + stroke: Pointer to the stroke attributes (for stroked + text), or NULL (for filled text). + + ctm: The matrix in use. + + The bounds are returned by value rather than through an + output pointer. + + Returns the bounding box for the text object. +*/ +fz_rect fz_bound_text(fz_context *ctx, const fz_text *text, const fz_stroke_state *stroke, fz_matrix ctm); + +/** + Convert ISO 639 (639-{1,2,3,5}) language specification + strings losslessly to a 15 bit fz_text_language code. + + No validation is carried out. Obviously invalid (out + of spec) codes will be mapped to FZ_LANG_UNSET, but + well-formed (but undefined) codes will be blithely + accepted. +*/ +fz_text_language fz_text_language_from_string(const char *str); + +/** + Recover ISO 639 (639-{1,2,3,5}) language specification + strings losslessly from a 15 bit fz_text_language code. + + No validation is carried out. See note above. +*/ +char *fz_string_from_text_language(char str[8], fz_text_language lang); + +#endif diff --git a/include/mupdf/fitz/track-usage.h b/include/mupdf/fitz/track-usage.h new file mode 100644 index 0000000..69e8425 --- /dev/null +++ b/include/mupdf/fitz/track-usage.h @@ -0,0 +1,57 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. 
+// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef TRACK_USAGE_H +#define TRACK_USAGE_H + +#ifdef TRACK_USAGE + +typedef struct track_usage_data { + int count; + const char *function; + int line; + const char *desc; + struct track_usage_data *next; +} track_usage_data; + +#define TRACK_LABEL(A) \ + do { \ + static track_usage_data USAGE_DATA = { 0 };\ + track_usage(&USAGE_DATA, __FILE__, __LINE__, A);\ + } while (0) + +#define TRACK_FN() \ + do { \ + static track_usage_data USAGE_DATA = { 0 };\ + track_usage(&USAGE_DATA, __FILE__, __LINE__, __FUNCTION__);\ + } while (0) + +void track_usage(track_usage_data *data, const char *function, int line, const char *desc); + +#else + +#define TRACK_LABEL(A) do { } while (0) +#define TRACK_FN() do { } while (0) + +#endif + +#endif diff --git a/include/mupdf/fitz/transition.h b/include/mupdf/fitz/transition.h new file mode 100644 index 0000000..89a8087 --- /dev/null +++ b/include/mupdf/fitz/transition.h @@ -0,0 +1,76 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. 
+// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_TRANSITION_H +#define MUPDF_FITZ_TRANSITION_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/pixmap.h" + +/* Transition support */ +enum { + FZ_TRANSITION_NONE = 0, /* aka 'R' or 'REPLACE' */ + FZ_TRANSITION_SPLIT, + FZ_TRANSITION_BLINDS, + FZ_TRANSITION_BOX, + FZ_TRANSITION_WIPE, + FZ_TRANSITION_DISSOLVE, + FZ_TRANSITION_GLITTER, + FZ_TRANSITION_FLY, + FZ_TRANSITION_PUSH, + FZ_TRANSITION_COVER, + FZ_TRANSITION_UNCOVER, + FZ_TRANSITION_FADE +}; + +typedef struct +{ + int type; + float duration; /* Effect duration (seconds) */ + + /* Parameters controlling the effect */ + int vertical; /* 0 or 1 */ + int outwards; /* 0 or 1 */ + int direction; /* Degrees */ + /* Potentially more to come */ + + /* State variables for use of the transition code */ + int state0; + int state1; +} fz_transition; + +/** + Generate a frame of a transition. + + tpix: Target pixmap + opix: Old pixmap + npix: New pixmap + time: Position within the transition (0 to 256) + trans: Transition details + + Returns 1 if successfully generated a frame. + + Note: Pixmaps must include alpha. 
+*/ +int fz_generate_transition(fz_context *ctx, fz_pixmap *tpix, fz_pixmap *opix, fz_pixmap *npix, int time, fz_transition *trans); + +#endif diff --git a/include/mupdf/fitz/tree.h b/include/mupdf/fitz/tree.h new file mode 100644 index 0000000..b4d7ac6 --- /dev/null +++ b/include/mupdf/fitz/tree.h @@ -0,0 +1,62 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_TREE_H +#define MUPDF_FITZ_TREE_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" + +/** + AA-tree to look up things by strings. +*/ + +typedef struct fz_tree fz_tree; + +/** + Look for the value of a node in the tree with the given key. + + Simple pointer equivalence is used for key. + + Returns NULL for no match. +*/ +void *fz_tree_lookup(fz_context *ctx, fz_tree *node, const char *key); + +/** + Insert a new key/value pair and rebalance the tree. + Return the new root of the tree after inserting and rebalancing. + May be called with a NULL root to create a new tree. + + No data is copied into the tree structure; key and value are + merely kept as pointers. 
+*/ +fz_tree *fz_tree_insert(fz_context *ctx, fz_tree *root, const char *key, void *value); + +/** + Drop the tree. + + The storage used by the tree is freed, and each value has + dropfunc called on it. +*/ +void fz_drop_tree(fz_context *ctx, fz_tree *node, void (*dropfunc)(fz_context *ctx, void *value)); + +#endif diff --git a/include/mupdf/fitz/types.h b/include/mupdf/fitz/types.h new file mode 100644 index 0000000..1299d2a --- /dev/null +++ b/include/mupdf/fitz/types.h @@ -0,0 +1,41 @@ +// Copyright (C) 2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_TYPES_H +#define MUPDF_FITZ_TYPES_H + +typedef struct fz_document fz_document; + +/** + Locations within the document are referred to in terms of + chapter and page, rather than just a page number. For some + documents (such as epub documents with large numbers of pages + broken into many chapters) this can make navigation much faster + as only the required chapter needs to be decoded at a time. 
+*/ +typedef struct +{ + int chapter; + int page; +} fz_location; + +#endif diff --git a/include/mupdf/fitz/util.h b/include/mupdf/fitz/util.h new file mode 100644 index 0000000..d8a3504 --- /dev/null +++ b/include/mupdf/fitz/util.h @@ -0,0 +1,151 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_UTIL_H +#define MUPDF_FITZ_UTIL_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/geometry.h" +#include "mupdf/fitz/document.h" +#include "mupdf/fitz/pixmap.h" +#include "mupdf/fitz/structured-text.h" +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/xml.h" +#include "mupdf/fitz/archive.h" +#include "mupdf/fitz/display-list.h" + +/** + Create a display list. + + Ownership of the display list is returned to the caller. +*/ +fz_display_list *fz_new_display_list_from_page(fz_context *ctx, fz_page *page); +fz_display_list *fz_new_display_list_from_page_number(fz_context *ctx, fz_document *doc, int number); + +/** + Create a display list from page contents (no annotations). 
+ + Ownership of the display list is returned to the caller. +*/ +fz_display_list *fz_new_display_list_from_page_contents(fz_context *ctx, fz_page *page); + +/** + Render the page to a pixmap using the transform and colorspace. + + Ownership of the pixmap is returned to the caller. +*/ +fz_pixmap *fz_new_pixmap_from_display_list(fz_context *ctx, fz_display_list *list, fz_matrix ctm, fz_colorspace *cs, int alpha); +fz_pixmap *fz_new_pixmap_from_page(fz_context *ctx, fz_page *page, fz_matrix ctm, fz_colorspace *cs, int alpha); +fz_pixmap *fz_new_pixmap_from_page_number(fz_context *ctx, fz_document *doc, int number, fz_matrix ctm, fz_colorspace *cs, int alpha); + +/** + Render the page contents without annotations. + + Ownership of the pixmap is returned to the caller. +*/ +fz_pixmap *fz_new_pixmap_from_page_contents(fz_context *ctx, fz_page *page, fz_matrix ctm, fz_colorspace *cs, int alpha); + +/** + Render the page contents with control over spot colors. + + Ownership of the pixmap is returned to the caller. +*/ +fz_pixmap *fz_new_pixmap_from_display_list_with_separations(fz_context *ctx, fz_display_list *list, fz_matrix ctm, fz_colorspace *cs, fz_separations *seps, int alpha); +fz_pixmap *fz_new_pixmap_from_page_with_separations(fz_context *ctx, fz_page *page, fz_matrix ctm, fz_colorspace *cs, fz_separations *seps, int alpha); +fz_pixmap *fz_new_pixmap_from_page_number_with_separations(fz_context *ctx, fz_document *doc, int number, fz_matrix ctm, fz_colorspace *cs, fz_separations *seps, int alpha); +fz_pixmap *fz_new_pixmap_from_page_contents_with_separations(fz_context *ctx, fz_page *page, fz_matrix ctm, fz_colorspace *cs, fz_separations *seps, int alpha); + +fz_pixmap *fz_fill_pixmap_from_display_list(fz_context *ctx, fz_display_list *list, fz_matrix ctm, fz_pixmap *pix); + +/** + Extract text from page. + + Ownership of the fz_stext_page is returned to the caller. 
+*/ +fz_stext_page *fz_new_stext_page_from_page(fz_context *ctx, fz_page *page, const fz_stext_options *options); +fz_stext_page *fz_new_stext_page_from_page_number(fz_context *ctx, fz_document *doc, int number, const fz_stext_options *options); +fz_stext_page *fz_new_stext_page_from_chapter_page_number(fz_context *ctx, fz_document *doc, int chapter, int number, const fz_stext_options *options); +fz_stext_page *fz_new_stext_page_from_display_list(fz_context *ctx, fz_display_list *list, const fz_stext_options *options); + +/** + Convert structured text into plain text. +*/ +fz_buffer *fz_new_buffer_from_stext_page(fz_context *ctx, fz_stext_page *text); +fz_buffer *fz_new_buffer_from_page(fz_context *ctx, fz_page *page, const fz_stext_options *options); +fz_buffer *fz_new_buffer_from_page_number(fz_context *ctx, fz_document *doc, int number, const fz_stext_options *options); +fz_buffer *fz_new_buffer_from_display_list(fz_context *ctx, fz_display_list *list, const fz_stext_options *options); + +/** + Search for the 'needle' text on the page. + Record the hits in the hit_bbox array and return the number of + hits. Will stop looking once it has filled hit_max rectangles. +*/ +int fz_search_page(fz_context *ctx, fz_page *page, const char *needle, int *hit_mark, fz_quad *hit_bbox, int hit_max); +int fz_search_page_number(fz_context *ctx, fz_document *doc, int number, const char *needle, int *hit_mark, fz_quad *hit_bbox, int hit_max); +int fz_search_chapter_page_number(fz_context *ctx, fz_document *doc, int chapter, int page, const char *needle, int *hit_mark, fz_quad *hit_bbox, int hit_max); +int fz_search_display_list(fz_context *ctx, fz_display_list *list, const char *needle, int *hit_mark, fz_quad *hit_bbox, int hit_max); + +/** + Parse an SVG document into a display-list. +*/ +fz_display_list *fz_new_display_list_from_svg(fz_context *ctx, fz_buffer *buf, const char *base_uri, fz_archive *zip, float *w, float *h); + +/** + Create a scalable image from an SVG document. 
+*/ +fz_image *fz_new_image_from_svg(fz_context *ctx, fz_buffer *buf, const char *base_uri, fz_archive *zip); + +/** + Parse an SVG document into a display-list. +*/ +fz_display_list *fz_new_display_list_from_svg_xml(fz_context *ctx, fz_xml_doc *xmldoc, fz_xml *xml, const char *base_uri, fz_archive *zip, float *w, float *h); + +/** + Create a scalable image from an SVG document. +*/ +fz_image *fz_new_image_from_svg_xml(fz_context *ctx, fz_xml_doc *xmldoc, fz_xml *xml, const char *base_uri, fz_archive *zip); + +/** + Write image as a data URI (for HTML and SVG output). +*/ +void fz_write_image_as_data_uri(fz_context *ctx, fz_output *out, fz_image *image); +void fz_write_pixmap_as_data_uri(fz_context *ctx, fz_output *out, fz_pixmap *pixmap); +void fz_append_image_as_data_uri(fz_context *ctx, fz_buffer *out, fz_image *image); +void fz_append_pixmap_as_data_uri(fz_context *ctx, fz_buffer *out, fz_pixmap *pixmap); + +/** + Use text extraction to convert the input document into XHTML, + then open the result as a new document that can be reflowed. +*/ +fz_document *fz_new_xhtml_document_from_document(fz_context *ctx, fz_document *old_doc, const fz_stext_options *opts); + +/** + Returns an fz_buffer containing a page after conversion to specified format. + + page: The page to convert. + format, options: Passed to fz_new_document_writer_with_output() internally. + transform, cookie: Passed to fz_run_page() internally. +*/ +fz_buffer *fz_new_buffer_from_page_with_format(fz_context *ctx, fz_page *page, const char *format, const char *options, fz_matrix transform, fz_cookie *cookie); + +#endif diff --git a/include/mupdf/fitz/version.h b/include/mupdf/fitz/version.h new file mode 100644 index 0000000..c3cad88 --- /dev/null +++ b/include/mupdf/fitz/version.h @@ -0,0 +1,31 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. 
+// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_VERSION_H +#define MUPDF_FITZ_VERSION_H +#ifndef FZ_VERSION +#define FZ_VERSION "1.23.0" +#define FZ_VERSION_MAJOR 1 +#define FZ_VERSION_MINOR 23 +#define FZ_VERSION_PATCH 0 +#endif +#endif diff --git a/include/mupdf/fitz/write-pixmap.h b/include/mupdf/fitz/write-pixmap.h new file mode 100644 index 0000000..8d618f7 --- /dev/null +++ b/include/mupdf/fitz/write-pixmap.h @@ -0,0 +1,476 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. 
If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_FITZ_WRITE_PIXMAP_H +#define MUPDF_FITZ_WRITE_PIXMAP_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/output.h" +#include "mupdf/fitz/band-writer.h" +#include "mupdf/fitz/pixmap.h" +#include "mupdf/fitz/bitmap.h" +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/image.h" +#include "mupdf/fitz/writer.h" + +/** + PCL output +*/ +typedef struct +{ + /* Features of a particular printer */ + int features; + const char *odd_page_init; + const char *even_page_init; + + /* Options for this job */ + int tumble; + int duplex_set; + int duplex; + int paper_size; + int manual_feed_set; + int manual_feed; + int media_position_set; + int media_position; + int orientation; + + /* Updated as we move through the job */ + int page_count; +} fz_pcl_options; + +/** + Initialize PCL option struct for a given preset. + + Currently defined presets include: + + generic Generic PCL printer + ljet4 HP DeskJet + dj500 HP DeskJet 500 + fs600 Kyocera FS-600 + lj HP LaserJet, HP LaserJet Plus + lj2 HP LaserJet IIp, HP LaserJet IId + lj3 HP LaserJet III + lj3d HP LaserJet IIId + lj4 HP LaserJet 4 + lj4pl HP LaserJet 4 PL + lj4d HP LaserJet 4d + lp2563b HP 2563B line printer + oce9050 Oce 9050 Line printer +*/ +void fz_pcl_preset(fz_context *ctx, fz_pcl_options *opts, const char *preset); + +/** + Parse PCL options. + + Currently defined options and values are as follows: + + preset=X Either "generic" or one of the presets as for fz_pcl_preset. 
+ spacing=0 No vertical spacing capability + spacing=1 PCL 3 spacing (*p+Y) + spacing=2 PCL 4 spacing (*bY) + spacing=3 PCL 5 spacing (*bY and clear seed row) + mode2 Disable/Enable mode 2 graphics compression + mode3 Disable/Enable mode 3 graphics compression + eog_reset End of graphics (*rB) resets all parameters + has_duplex Duplex supported (&lS) + has_papersize Papersize setting supported (&lA) + has_copies Number of copies supported (&lX) + is_ljet4pjl Disable/Enable HP 4PJL model-specific output + is_oce9050 Disable/Enable Oce 9050 model-specific output +*/ +fz_pcl_options *fz_parse_pcl_options(fz_context *ctx, fz_pcl_options *opts, const char *args); + +/** + Create a new band writer, outputting monochrome PCL. +*/ +fz_band_writer *fz_new_mono_pcl_band_writer(fz_context *ctx, fz_output *out, const fz_pcl_options *options); + +/** + Write a bitmap as mono PCL. +*/ +void fz_write_bitmap_as_pcl(fz_context *ctx, fz_output *out, const fz_bitmap *bitmap, const fz_pcl_options *pcl); + +/** + Save a bitmap as mono PCL. +*/ +void fz_save_bitmap_as_pcl(fz_context *ctx, fz_bitmap *bitmap, char *filename, int append, const fz_pcl_options *pcl); + +/** + Create a new band writer, outputting color PCL. +*/ +fz_band_writer *fz_new_color_pcl_band_writer(fz_context *ctx, fz_output *out, const fz_pcl_options *options); + +/** + Write an (RGB) pixmap as color PCL. +*/ +void fz_write_pixmap_as_pcl(fz_context *ctx, fz_output *out, const fz_pixmap *pixmap, const fz_pcl_options *pcl); + +/** + Save an (RGB) pixmap as color PCL. +*/ +void fz_save_pixmap_as_pcl(fz_context *ctx, fz_pixmap *pixmap, char *filename, int append, const fz_pcl_options *pcl); + +/** + PCLm output +*/ +typedef struct +{ + int compress; + int strip_height; + + /* Updated as we move through the job */ + int page_count; +} fz_pclm_options; + +/** + Parse PCLm options.
+ + Currently defined options and values are as follows: + + compression=none: No compression + compression=flate: Flate compression + strip-height=n: Strip height (default 16) +*/ +fz_pclm_options *fz_parse_pclm_options(fz_context *ctx, fz_pclm_options *opts, const char *args); + +/** + Create a new band writer, outputting pclm +*/ +fz_band_writer *fz_new_pclm_band_writer(fz_context *ctx, fz_output *out, const fz_pclm_options *options); + +/** + Write a (Greyscale or RGB) pixmap as pclm. +*/ +void fz_write_pixmap_as_pclm(fz_context *ctx, fz_output *out, const fz_pixmap *pixmap, const fz_pclm_options *options); + +/** + Save a (Greyscale or RGB) pixmap as pclm. +*/ +void fz_save_pixmap_as_pclm(fz_context *ctx, fz_pixmap *pixmap, char *filename, int append, const fz_pclm_options *options); + +/** + PDFOCR output +*/ +typedef struct +{ + int compress; + int strip_height; + char language[256]; + char datadir[1024]; + + /* Updated as we move through the job */ + int page_count; +} fz_pdfocr_options; + +/** + Parse PDFOCR options. + + Currently defined options and values are as follows: + + compression=none: No compression + compression=flate: Flate compression + strip-height=n: Strip height (default 16) + ocr-language=: OCR Language (default eng) + ocr-datadir=: OCR data path (default rely on TESSDATA_PREFIX) +*/ +fz_pdfocr_options *fz_parse_pdfocr_options(fz_context *ctx, fz_pdfocr_options *opts, const char *args); + +/** + Create a new band writer, outputting pdfocr. + + Ownership of output stays with the caller, the band writer + borrows the reference. The caller must keep the output around + for the duration of the band writer, and then close/drop as + appropriate. +*/ +fz_band_writer *fz_new_pdfocr_band_writer(fz_context *ctx, fz_output *out, const fz_pdfocr_options *options); + +/** + Set the progress callback for a pdfocr bandwriter.
+*/ +void fz_pdfocr_band_writer_set_progress(fz_context *ctx, fz_band_writer *writer, fz_pdfocr_progress_fn *progress_fn, void *progress_arg); + +/** + Write a (Greyscale or RGB) pixmap as pdfocr. +*/ +void fz_write_pixmap_as_pdfocr(fz_context *ctx, fz_output *out, const fz_pixmap *pixmap, const fz_pdfocr_options *options); + +/** + Save a (Greyscale or RGB) pixmap as pdfocr. +*/ +void fz_save_pixmap_as_pdfocr(fz_context *ctx, fz_pixmap *pixmap, char *filename, int append, const fz_pdfocr_options *options); + +/** + Save a (Greyscale or RGB) pixmap as a png. +*/ +void fz_save_pixmap_as_png(fz_context *ctx, fz_pixmap *pixmap, const char *filename); + +/** + Save a pixmap as a JPEG. +*/ +void fz_save_pixmap_as_jpeg(fz_context *ctx, fz_pixmap *pixmap, const char *filename, int quality); + +/** + Write a (Greyscale or RGB) pixmap as a png. +*/ +void fz_write_pixmap_as_png(fz_context *ctx, fz_output *out, const fz_pixmap *pixmap); + +/** + Create a new png band writer (greyscale or RGB, with or without + alpha). +*/ +fz_band_writer *fz_new_png_band_writer(fz_context *ctx, fz_output *out); + +/** + Reencode a given image as a PNG into a buffer. + + Ownership of the buffer is returned. +*/ +fz_buffer *fz_new_buffer_from_image_as_png(fz_context *ctx, fz_image *image, fz_color_params color_params); +fz_buffer *fz_new_buffer_from_image_as_pnm(fz_context *ctx, fz_image *image, fz_color_params color_params); +fz_buffer *fz_new_buffer_from_image_as_pam(fz_context *ctx, fz_image *image, fz_color_params color_params); +fz_buffer *fz_new_buffer_from_image_as_psd(fz_context *ctx, fz_image *image, fz_color_params color_params); +fz_buffer *fz_new_buffer_from_image_as_jpeg(fz_context *ctx, fz_image *image, fz_color_params color_params, int quality); + +/** + Reencode a given pixmap as a PNG into a buffer. + + Ownership of the buffer is returned. 
+*/ +fz_buffer *fz_new_buffer_from_pixmap_as_png(fz_context *ctx, fz_pixmap *pixmap, fz_color_params color_params); +fz_buffer *fz_new_buffer_from_pixmap_as_pnm(fz_context *ctx, fz_pixmap *pixmap, fz_color_params color_params); +fz_buffer *fz_new_buffer_from_pixmap_as_pam(fz_context *ctx, fz_pixmap *pixmap, fz_color_params color_params); +fz_buffer *fz_new_buffer_from_pixmap_as_psd(fz_context *ctx, fz_pixmap *pix, fz_color_params color_params); +fz_buffer *fz_new_buffer_from_pixmap_as_jpeg(fz_context *ctx, fz_pixmap *pixmap, fz_color_params color_params, int quality); + +/** + Save a pixmap as a pnm (greyscale or rgb, no alpha). +*/ +void fz_save_pixmap_as_pnm(fz_context *ctx, fz_pixmap *pixmap, const char *filename); + +/** + Write a pixmap as a pnm (greyscale or rgb, no alpha). +*/ +void fz_write_pixmap_as_pnm(fz_context *ctx, fz_output *out, fz_pixmap *pixmap); + +/** + Create a band writer targeting pnm (greyscale or rgb, no + alpha). +*/ +fz_band_writer *fz_new_pnm_band_writer(fz_context *ctx, fz_output *out); + +/** + Save a pixmap as a pam (greyscale, rgb or cmyk, with or without + alpha). +*/ +void fz_save_pixmap_as_pam(fz_context *ctx, fz_pixmap *pixmap, const char *filename); + +/** + Write a pixmap as a pam (greyscale, rgb or cmyk, with or without + alpha). +*/ +void fz_write_pixmap_as_pam(fz_context *ctx, fz_output *out, fz_pixmap *pixmap); + +/** + Create a band writer targeting pam (greyscale, rgb or cmyk, + with or without alpha). +*/ +fz_band_writer *fz_new_pam_band_writer(fz_context *ctx, fz_output *out); + +/** + Save a bitmap as a pbm. +*/ +void fz_save_bitmap_as_pbm(fz_context *ctx, fz_bitmap *bitmap, const char *filename); + +/** + Write a bitmap as a pbm. +*/ +void fz_write_bitmap_as_pbm(fz_context *ctx, fz_output *out, fz_bitmap *bitmap); + +/** + Create a new band writer, targeting pbm. +*/ +fz_band_writer *fz_new_pbm_band_writer(fz_context *ctx, fz_output *out); + +/** + Save a pixmap as a pbm. (Performing halftoning).
+*/ +void fz_save_pixmap_as_pbm(fz_context *ctx, fz_pixmap *pixmap, const char *filename); + +/** + Save a CMYK bitmap as a pkm. +*/ +void fz_save_bitmap_as_pkm(fz_context *ctx, fz_bitmap *bitmap, const char *filename); + +/** + Write a CMYK bitmap as a pkm. +*/ +void fz_write_bitmap_as_pkm(fz_context *ctx, fz_output *out, fz_bitmap *bitmap); + +/** + Create a new pkm band writer for CMYK pixmaps. +*/ +fz_band_writer *fz_new_pkm_band_writer(fz_context *ctx, fz_output *out); + +/** + Save a CMYK pixmap as a pkm. (Performing halftoning). +*/ +void fz_save_pixmap_as_pkm(fz_context *ctx, fz_pixmap *pixmap, const char *filename); + +/** + Write a (gray, rgb, or cmyk, no alpha) pixmap out as postscript. +*/ +void fz_write_pixmap_as_ps(fz_context *ctx, fz_output *out, const fz_pixmap *pixmap); + +/** + Save a (gray, rgb, or cmyk, no alpha) pixmap out as postscript. +*/ +void fz_save_pixmap_as_ps(fz_context *ctx, fz_pixmap *pixmap, char *filename, int append); + +/** + Create a postscript band writer for gray, rgb, or cmyk, no + alpha. +*/ +fz_band_writer *fz_new_ps_band_writer(fz_context *ctx, fz_output *out); + +/** + Write the file level header for ps band writer output. +*/ +void fz_write_ps_file_header(fz_context *ctx, fz_output *out); + +/** + Write the file level trailer for ps band writer output. +*/ +void fz_write_ps_file_trailer(fz_context *ctx, fz_output *out, int pages); + +/** + Save a pixmap as a PSD file. +*/ +void fz_save_pixmap_as_psd(fz_context *ctx, fz_pixmap *pixmap, const char *filename); + +/** + Write a pixmap as a PSD file. +*/ +void fz_write_pixmap_as_psd(fz_context *ctx, fz_output *out, const fz_pixmap *pixmap); + +/** + Open a PSD band writer. +*/ +fz_band_writer *fz_new_psd_band_writer(fz_context *ctx, fz_output *out); + +typedef struct +{ + /* These are not interpreted as CStrings by the writing code, + * but are rather copied directly out. 
*/ + char media_class[64]; + char media_color[64]; + char media_type[64]; + char output_type[64]; + + unsigned int advance_distance; + int advance_media; + int collate; + int cut_media; + int duplex; + int insert_sheet; + int jog; + int leading_edge; + int manual_feed; + unsigned int media_position; + unsigned int media_weight; + int mirror_print; + int negative_print; + unsigned int num_copies; + int orientation; + int output_face_up; + unsigned int PageSize[2]; + int separations; + int tray_switch; + int tumble; + + int media_type_num; + int compression; + unsigned int row_count; + unsigned int row_feed; + unsigned int row_step; + + /* These are not interpreted as CStrings by the writing code, but + * are rather copied directly out. */ + char rendering_intent[64]; + char page_size_name[64]; +} fz_pwg_options; + +/** + Save a pixmap as a PWG. +*/ +void fz_save_pixmap_as_pwg(fz_context *ctx, fz_pixmap *pixmap, char *filename, int append, const fz_pwg_options *pwg); + +/** + Save a bitmap as a PWG. +*/ +void fz_save_bitmap_as_pwg(fz_context *ctx, fz_bitmap *bitmap, char *filename, int append, const fz_pwg_options *pwg); + +/** + Write a pixmap as a PWG. +*/ +void fz_write_pixmap_as_pwg(fz_context *ctx, fz_output *out, const fz_pixmap *pixmap, const fz_pwg_options *pwg); + +/** + Write a bitmap as a PWG. +*/ +void fz_write_bitmap_as_pwg(fz_context *ctx, fz_output *out, const fz_bitmap *bitmap, const fz_pwg_options *pwg); + +/** + Write a pixmap as a PWG page. + + Caller should provide a file header by calling + fz_write_pwg_file_header, but can then write several pages to + the same file. +*/ +void fz_write_pixmap_as_pwg_page(fz_context *ctx, fz_output *out, const fz_pixmap *pixmap, const fz_pwg_options *pwg); + +/** + Write a bitmap as a PWG page. + + Caller should provide a file header by calling + fz_write_pwg_file_header, but can then write several pages to + the same file. 
+*/ +void fz_write_bitmap_as_pwg_page(fz_context *ctx, fz_output *out, const fz_bitmap *bitmap, const fz_pwg_options *pwg); + +/** + Create a new monochrome pwg band writer. +*/ +fz_band_writer *fz_new_mono_pwg_band_writer(fz_context *ctx, fz_output *out, const fz_pwg_options *pwg); + +/** + Create a new color pwg band writer. +*/ +fz_band_writer *fz_new_pwg_band_writer(fz_context *ctx, fz_output *out, const fz_pwg_options *pwg); + +/** + Output the file header to a pwg stream, ready for pages to follow it. +*/ +void fz_write_pwg_file_header(fz_context *ctx, fz_output *out); /* for use by mudraw.c */ + +#endif diff --git a/include/mupdf/fitz/writer.h b/include/mupdf/fitz/writer.h new file mode 100644 index 0000000..ed7f529 --- /dev/null +++ b/include/mupdf/fitz/writer.h @@ -0,0 +1,265 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_WRITER_H +#define MUPDF_FITZ_WRITER_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/output.h" +#include "mupdf/fitz/document.h" +#include "mupdf/fitz/device.h" + +typedef struct fz_document_writer fz_document_writer; + +/** + Function type to start + the process of writing a page to a document. + + mediabox: page size rectangle in points. + + Returns a fz_device to write page contents to. +*/ +typedef fz_device *(fz_document_writer_begin_page_fn)(fz_context *ctx, fz_document_writer *wri, fz_rect mediabox); + +/** + Function type to end the + process of writing a page to a document. + + dev: The device created by the begin_page function. +*/ +typedef void (fz_document_writer_end_page_fn)(fz_context *ctx, fz_document_writer *wri, fz_device *dev); + +/** + Function type to end + the process of writing pages to a document. + + This writes any file level trailers required. After this + completes successfully the file is up to date and complete. +*/ +typedef void (fz_document_writer_close_writer_fn)(fz_context *ctx, fz_document_writer *wri); + +/** + Function type to discard + an fz_document_writer. This may be called at any time during + the process to release all the resources owned by the writer. + + Calling drop without having previously called close may leave + the file in an inconsistent state and the user of the + fz_document_writer would need to do any cleanup required. +*/ +typedef void (fz_document_writer_drop_writer_fn)(fz_context *ctx, fz_document_writer *wri); + +#define fz_new_derived_document_writer(CTX,TYPE,BEGIN_PAGE,END_PAGE,CLOSE,DROP) \ + ((TYPE *)Memento_label(fz_new_document_writer_of_size(CTX,sizeof(TYPE),BEGIN_PAGE,END_PAGE,CLOSE,DROP),#TYPE)) + +/** + Look for a given option (key) in the opts string. Return 1 if + it has it, and update *val to point to the value within opts. 
+*/ +int fz_has_option(fz_context *ctx, const char *opts, const char *key, const char **val); + +/** + Check to see if an option, a, from a string matches a reference + option, b. + + (i.e. a could be 'foo' or 'foo,bar...' etc, but b can only be + 'foo'.) +*/ +int fz_option_eq(const char *a, const char *b); + +/** + Copy an option (val) into a destination buffer (dest), of maxlen + bytes. + + Returns the number of bytes (including terminator) that did not + fit. If val is maxlen or greater bytes in size, it will be left + unterminated. +*/ +size_t fz_copy_option(fz_context *ctx, const char *val, char *dest, size_t maxlen); + +/** + Create a new fz_document_writer, for a + file of the given type. + + path: The document name to write (or NULL for default) + + format: Which format to write (currently cbz, html, pdf, pam, + pbm, pgm, pkm, png, ppm, pnm, svg, text, xhtml, docx, odt) + + options: NULL, or pointer to comma separated string to control + file generation. +*/ +fz_document_writer *fz_new_document_writer(fz_context *ctx, const char *path, const char *format, const char *options); + +/** + Like fz_new_document_writer but takes a fz_output for writing + the result. Only works for multi-page formats. +*/ +fz_document_writer * +fz_new_document_writer_with_output(fz_context *ctx, fz_output *out, const char *format, const char *options); + +fz_document_writer * +fz_new_document_writer_with_buffer(fz_context *ctx, fz_buffer *buf, const char *format, const char *options); + +/** + Document writers for various possible output formats. + + All of the "_with_output" variants pass the ownership of out in + immediately upon calling. The writers are responsible for + dropping the fz_output when they are finished with it (even + if they throw an exception during creation). 
+*/ +fz_document_writer *fz_new_pdf_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_pdf_writer_with_output(fz_context *ctx, fz_output *out, const char *options); +fz_document_writer *fz_new_svg_writer(fz_context *ctx, const char *path, const char *options); + +fz_document_writer *fz_new_text_writer(fz_context *ctx, const char *format, const char *path, const char *options); +fz_document_writer *fz_new_text_writer_with_output(fz_context *ctx, const char *format, fz_output *out, const char *options); + +fz_document_writer *fz_new_odt_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_odt_writer_with_output(fz_context *ctx, fz_output *out, const char *options); +fz_document_writer *fz_new_docx_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_docx_writer_with_output(fz_context *ctx, fz_output *out, const char *options); + +fz_document_writer *fz_new_ps_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_ps_writer_with_output(fz_context *ctx, fz_output *out, const char *options); +fz_document_writer *fz_new_pcl_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_pcl_writer_with_output(fz_context *ctx, fz_output *out, const char *options); +fz_document_writer *fz_new_pclm_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_pclm_writer_with_output(fz_context *ctx, fz_output *out, const char *options); +fz_document_writer *fz_new_pwg_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_pwg_writer_with_output(fz_context *ctx, fz_output *out, const char *options); + +fz_document_writer *fz_new_cbz_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_cbz_writer_with_output(fz_context *ctx, fz_output *out, const char *options); + +/** + Used to report 
progress of the OCR operation. + + page: Current page being processed. + + percent: Progress of the OCR operation for the + current page in percent. Whether it reaches 100 + once a page is finished, depends on the OCR engine. + + Return 0 to continue progress, return 1 to cancel the + operation. +*/ +typedef int (fz_pdfocr_progress_fn)(fz_context *ctx, void *progress_arg, int page, int percent); + +fz_document_writer *fz_new_pdfocr_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_pdfocr_writer_with_output(fz_context *ctx, fz_output *out, const char *options); +void fz_pdfocr_writer_set_progress(fz_context *ctx, fz_document_writer *writer, fz_pdfocr_progress_fn *progress, void *); + +fz_document_writer *fz_new_jpeg_pixmap_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_png_pixmap_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_pam_pixmap_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_pnm_pixmap_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_pgm_pixmap_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_ppm_pixmap_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_pbm_pixmap_writer(fz_context *ctx, const char *path, const char *options); +fz_document_writer *fz_new_pkm_pixmap_writer(fz_context *ctx, const char *path, const char *options); + +/** + Called to start the process of writing a page to + a document. + + mediabox: page size rectangle in points. + + Returns a borrowed fz_device to write page contents to. This + should be kept if required, and only dropped if it was kept. +*/ +fz_device *fz_begin_page(fz_context *ctx, fz_document_writer *wri, fz_rect mediabox); + +/** + Called to end the process of writing a page to a + document. 
+*/ +void fz_end_page(fz_context *ctx, fz_document_writer *wri); + +/** + Convenience function to feed all the pages of a document to + fz_begin_page/fz_run_page/fz_end_page. +*/ +void fz_write_document(fz_context *ctx, fz_document_writer *wri, fz_document *doc); + +/** + Called to end the process of writing + pages to a document. + + This writes any file level trailers required. After this + completes successfully the file is up to date and complete. +*/ +void fz_close_document_writer(fz_context *ctx, fz_document_writer *wri); + +/** + Called to discard a fz_document_writer. + This may be called at any time during the process to release all + the resources owned by the writer. + + Calling drop without having previously called close may leave + the file in an inconsistent state. +*/ +void fz_drop_document_writer(fz_context *ctx, fz_document_writer *wri); + +fz_document_writer *fz_new_pixmap_writer(fz_context *ctx, const char *path, const char *options, const char *default_path, int n, + void (*save)(fz_context *ctx, fz_pixmap *pix, const char *filename)); + +FZ_DATA extern const char *fz_pdf_write_options_usage; +FZ_DATA extern const char *fz_svg_write_options_usage; + +FZ_DATA extern const char *fz_pcl_write_options_usage; +FZ_DATA extern const char *fz_pclm_write_options_usage; +FZ_DATA extern const char *fz_pwg_write_options_usage; +FZ_DATA extern const char *fz_pdfocr_write_options_usage; + +/* Implementation details: subject to change. */ + +/** + Structure is public to allow other structures to + be derived from it. Do not access members directly. 
+*/ +struct fz_document_writer +{ + fz_document_writer_begin_page_fn *begin_page; + fz_document_writer_end_page_fn *end_page; + fz_document_writer_close_writer_fn *close_writer; + fz_document_writer_drop_writer_fn *drop_writer; + fz_device *dev; +}; + +/** + Internal function to allocate a + block for a derived document_writer structure, with the base + structure's function pointers populated correctly, and the extra + space zero initialised. +*/ +fz_document_writer *fz_new_document_writer_of_size(fz_context *ctx, size_t size, + fz_document_writer_begin_page_fn *begin_page, + fz_document_writer_end_page_fn *end_page, + fz_document_writer_close_writer_fn *close, + fz_document_writer_drop_writer_fn *drop); + + + +#endif diff --git a/include/mupdf/fitz/xml.h b/include/mupdf/fitz/xml.h new file mode 100644 index 0000000..642249d --- /dev/null +++ b/include/mupdf/fitz/xml.h @@ -0,0 +1,391 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_FITZ_XML_H +#define MUPDF_FITZ_XML_H + +#include "mupdf/fitz/system.h" +#include "mupdf/fitz/context.h" +#include "mupdf/fitz/buffer.h" +#include "mupdf/fitz/pool.h" +#include "mupdf/fitz/archive.h" + +/** + XML document model +*/ + +typedef struct fz_xml fz_xml; + +/* For backwards compatibility */ +typedef fz_xml fz_xml_doc; + +/** + Parse the contents of buffer into a tree of xml nodes. + + preserve_white: whether to keep or delete all-whitespace nodes. +*/ +fz_xml *fz_parse_xml(fz_context *ctx, fz_buffer *buf, int preserve_white); + +/** + Parse the contents of buffer into a tree of xml nodes. + + preserve_white: whether to keep or delete all-whitespace nodes. +*/ +fz_xml *fz_parse_xml_stream(fz_context *ctx, fz_stream *stream, int preserve_white); + +/** + Parse the contents of an archive entry into a tree of xml nodes. + + preserve_white: whether to keep or delete all-whitespace nodes. +*/ +fz_xml *fz_parse_xml_archive_entry(fz_context *ctx, fz_archive *arch, const char *filename, int preserve_white); + +/** + Try and parse the contents of an archive entry into a tree of xml nodes. + + preserve_white: whether to keep or delete all-whitespace nodes. + + Will return NULL if the archive entry can't be found. Otherwise behaves + the same as fz_parse_xml_archive_entry. May throw exceptions. +*/ +fz_xml *fz_try_parse_xml_archive_entry(fz_context *ctx, fz_archive *arch, const char *filename, int preserve_white); + +/** + Parse the contents of a buffer into a tree of XML nodes, + using the HTML5 parsing algorithm. +*/ +fz_xml *fz_parse_xml_from_html5(fz_context *ctx, fz_buffer *buf); + +/** + Add a reference to the XML. +*/ +fz_xml *fz_keep_xml(fz_context *ctx, fz_xml *xml); + +/** + Drop a reference to the XML. When the last reference is + dropped, the node and all its children and siblings will + be freed. 
+*/ +void fz_drop_xml(fz_context *ctx, fz_xml *xml); + +/** + Detach a node from the tree, unlinking it from its parent, + and setting the document root to the node. +*/ +void fz_detach_xml(fz_context *ctx, fz_xml *node); + +/** + Return the topmost XML node of a document. +*/ +fz_xml *fz_xml_root(fz_xml_doc *xml); + +/** + Return previous sibling of XML node. +*/ +fz_xml *fz_xml_prev(fz_xml *item); + +/** + Return next sibling of XML node. +*/ +fz_xml *fz_xml_next(fz_xml *item); + +/** + Return parent of XML node. +*/ +fz_xml *fz_xml_up(fz_xml *item); + +/** + Return first child of XML node. +*/ +fz_xml *fz_xml_down(fz_xml *item); + +/** + Return true if the tag name matches. +*/ +int fz_xml_is_tag(fz_xml *item, const char *name); + +/** + Return tag of XML node. Return NULL for text nodes. +*/ +char *fz_xml_tag(fz_xml *item); + +/** + Return the value of an attribute of an XML node. + NULL if the attribute doesn't exist. +*/ +char *fz_xml_att(fz_xml *item, const char *att); + +/** + Return the value of an attribute of an XML node. + If the first attribute doesn't exist, try the second. + NULL if neither attribute exists. +*/ +char *fz_xml_att_alt(fz_xml *item, const char *one, const char *two); + +/** + Check for a matching attribute on an XML node. + + If the node has the requested attribute (name), and the value + matches (match) then return 1. Otherwise, 0. +*/ +int fz_xml_att_eq(fz_xml *item, const char *name, const char *match); + +/** + Add an attribute to an XML node. +*/ +void fz_xml_add_att(fz_context *ctx, fz_pool *pool, fz_xml *node, const char *key, const char *val); + +/** + Return the text content of an XML node. + Return NULL if the node is a tag. +*/ +char *fz_xml_text(fz_xml *item); + +/** + Pretty-print an XML tree to stdout. +*/ +void fz_debug_xml(fz_xml *item, int level); + +/** + Search the siblings of XML nodes starting with item looking for + the first with the given tag. + + Return NULL if none found. 
+*/ +fz_xml *fz_xml_find(fz_xml *item, const char *tag); + +/** + Search the siblings of XML nodes starting with the first sibling + of item looking for the first with the given tag. + + Return NULL if none found. +*/ +fz_xml *fz_xml_find_next(fz_xml *item, const char *tag); + +/** + Search the siblings of XML nodes starting with the first child + of item looking for the first with the given tag. + + Return NULL if none found. +*/ +fz_xml *fz_xml_find_down(fz_xml *item, const char *tag); + +/** + Search the siblings of XML nodes starting with item looking for + the first with the given tag (or any tag if tag is NULL), and + with a matching attribute. + + Return NULL if none found. +*/ +fz_xml *fz_xml_find_match(fz_xml *item, const char *tag, const char *att, const char *match); + +/** + Search the siblings of XML nodes starting with the first sibling + of item looking for the first with the given tag (or any tag if tag + is NULL), and with a matching attribute. + + Return NULL if none found. +*/ +fz_xml *fz_xml_find_next_match(fz_xml *item, const char *tag, const char *att, const char *match); + +/** + Search the siblings of XML nodes starting with the first child + of item looking for the first with the given tag (or any tag if + tag is NULL), and with a matching attribute. + + Return NULL if none found. +*/ +fz_xml *fz_xml_find_down_match(fz_xml *item, const char *tag, const char *att, const char *match); + +/** + Perform a depth first search from item, returning the first + child that matches the given tag (or any tag if tag is NULL), + with the given attribute (if att is non NULL), that matches + match (if match is non NULL). +*/ +fz_xml *fz_xml_find_dfs(fz_xml *item, const char *tag, const char *att, const char *match); + +/** + Perform a depth first search from item, returning the first + child that matches the given tag (or any tag if tag is NULL), + with the given attribute (if att is non NULL), that matches + match (if match is non NULL). 
The search stops if it ever + reaches the top of the tree, or the declared 'top' item. +*/ +fz_xml *fz_xml_find_dfs_top(fz_xml *item, const char *tag, const char *att, const char *match, fz_xml *top); + +/** + Perform a depth first search onwards from item, returning the first + child that matches the given tag (or any tag if tag is NULL), + with the given attribute (if att is non NULL), that matches + match (if match is non NULL). +*/ +fz_xml *fz_xml_find_next_dfs(fz_xml *item, const char *tag, const char *att, const char *match); + +/** + Perform a depth first search onwards from item, returning the first + child that matches the given tag (or any tag if tag is NULL), + with the given attribute (if att is non NULL), that matches + match (if match is non NULL). The search stops if it ever reaches + the top of the tree, or the declared 'top' item. +*/ +fz_xml *fz_xml_find_next_dfs_top(fz_xml *item, const char *tag, const char *att, const char *match, fz_xml *top); + +/** + DOM-like functions for html in xml. +*/ + +/** + Return a borrowed reference for the 'body' element of + the given DOM. +*/ +fz_xml *fz_dom_body(fz_context *ctx, fz_xml *dom); + +/** + Return a borrowed reference for the document (the top + level element) of the DOM. +*/ +fz_xml *fz_dom_document_element(fz_context *ctx, fz_xml *dom); + +/** + Create an element of a given tag type for the given DOM. + + The element is not linked into the DOM yet. +*/ +fz_xml *fz_dom_create_element(fz_context *ctx, fz_xml *dom, const char *tag); + +/** + Create a text node for the given DOM. + + The element is not linked into the DOM yet. +*/ +fz_xml *fz_dom_create_text_node(fz_context *ctx, fz_xml *dom, const char *text); + +/** + Find the first element matching the requirements in a depth first traversal from elt. + + The tagname must match tag, unless tag is NULL, when all tag names are considered to match. + + If att is NULL, then all tags match. 
+ Otherwise: + If match is NULL, then only nodes that have an att attribute match. + If match is non-NULL, then only nodes that have an att attribute that matches match match. + + Returns NULL (if no match found), or a borrowed reference to the first matching element. +*/ +fz_xml *fz_dom_find(fz_context *ctx, fz_xml *elt, const char *tag, const char *att, const char *match); + +/** + Find the next element matching the requirements. +*/ +fz_xml *fz_dom_find_next(fz_context *ctx, fz_xml *elt, const char *tag, const char *att, const char *match); + +/** + Insert an element as the last child of a parent, unlinking the + child from its current position if required. +*/ +void fz_dom_append_child(fz_context *ctx, fz_xml *parent, fz_xml *child); + +/** + Insert an element (new_elt), before another element (node), + unlinking the new_elt from its current position if required. +*/ +void fz_dom_insert_before(fz_context *ctx, fz_xml *node, fz_xml *new_elt); + +/** + Insert an element (new_elt), after another element (node), + unlinking the new_elt from its current position if required. +*/ +void fz_dom_insert_after(fz_context *ctx, fz_xml *node, fz_xml *new_elt); + +/** + Remove an element from the DOM. The element can be added back elsewhere + if required. + + No reference counting changes for the element. +*/ +void fz_dom_remove(fz_context *ctx, fz_xml *elt); + +/** + Clone an element (and its children). + + A borrowed reference to the clone is returned. The clone is not + yet linked into the DOM. +*/ +fz_xml *fz_dom_clone(fz_context *ctx, fz_xml *elt); + +/** + Return a borrowed reference to the first child of a node, + or NULL if there isn't one. +*/ +fz_xml *fz_dom_first_child(fz_context *ctx, fz_xml *elt); + +/** + Return a borrowed reference to the parent of a node, + or NULL if there isn't one. +*/ +fz_xml *fz_dom_parent(fz_context *ctx, fz_xml *elt); + +/** + Return a borrowed reference to the next sibling of a node, + or NULL if there isn't one. 
+*/ +fz_xml *fz_dom_next(fz_context *ctx, fz_xml *elt); + +/** + Return a borrowed reference to the previous sibling of a node, + or NULL if there isn't one. +*/ +fz_xml *fz_dom_previous(fz_context *ctx, fz_xml *elt); + +/** + Add an attribute to an element. + + Ownership of att and value remain with the caller. +*/ +void fz_dom_add_attribute(fz_context *ctx, fz_xml *elt, const char *att, const char *value); + +/** + Remove an attribute from an element. +*/ +void fz_dom_remove_attribute(fz_context *ctx, fz_xml *elt, const char *att); + +/** + Retrieve the value of a given attribute from a given element. + + Returns a borrowed pointer to the value or NULL if not found. +*/ +const char *fz_dom_attribute(fz_context *ctx, fz_xml *elt, const char *att); + +/** + Enumerate through the attributes of an element. + + Call with i=0,1,2,3... to enumerate attributes. + + On return *att and the return value will be NULL if there are not + that many attributes to read. Otherwise, *att will be filled in + with a borrowed pointer to the attribute name, and the return + value will be a borrowed pointer to the value. +*/ +const char *fz_dom_get_attribute(fz_context *ctx, fz_xml *elt, int i, const char **att); + +#endif diff --git a/include/mupdf/helpers/mu-office-lib.h b/include/mupdf/helpers/mu-office-lib.h new file mode 100644 index 0000000..4738f3d --- /dev/null +++ b/include/mupdf/helpers/mu-office-lib.h @@ -0,0 +1,666 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. 
See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +/** + * Mu Office Library + * + * This helper layer provides an API for loading and displaying + * a file. It is deliberately as identical as possible to the + * smart-office-lib.h header file in Smart Office to facilitate + * a product which can use both Smart Office and MuPDF. + */ + +#ifndef MU_OFFICE_LIB_H +#define MU_OFFICE_LIB_H + +/* + * General use + * + * This library uses threads but is not thread safe for the caller. All + * calls should be made from a single thread. Some calls allow the user + * to arrange for their own functions to be called back from the library; + * often the user's functions will be called back from a different thread. If + * a call into the library is required in response to information from a + * callback, it is the responsibility of the user to arrange for those + * library calls to be made on the same thread as for other library calls. + * Calls back into the library from a callback are not permitted. + * + * There are two main modes of use. In interactive windowing systems, it + * is usual for the app developer's code to be called on the main UI thread. + * It is from the UI thread that all library calls should be made. Such + * systems usually provide a way to post a call onto the UI thread. That + * is the best way to respond to a callback from the library. + * + * The other mode of use is for plain executables that might wish to + * (say) generate images of all the pages of a document. This can + * be achieved, without use of callbacks, using the few synchronous + * calls. 
E.g., MuOfficeDoc_getNumPages will wait for background document + * loading to complete before returning the total number of pages and + * MuOfficeRender_destroy will wait for background rendering to complete + * before returning. + */ + +#include /* For size_t */ +#include "mupdf/fitz.h" /* For fz_context/fz_document/fz_page */ + +/** Error type returned from most MuOffice functions + * + * 0 means no error + * + * non-zero values mean an error occurred. The exact value is an indication + * of what went wrong and should be included in bug reports or support + * queries. Library users should not test this value except for 0, + * non-zero and any explicitly documented values. + */ +typedef int MuError; + +/** Errors returned to MuOfficeLoadingErrorFn + * + * Other values may also be returned. + */ +typedef enum MuOfficeDocErrorType +{ + MuOfficeDocErrorType_NoError = 0, + MuOfficeDocErrorType_UnsupportedDocumentType = 1, + MuOfficeDocErrorType_EmptyDocument = 2, + MuOfficeDocErrorType_UnableToLoadDocument = 4, + MuOfficeDocErrorType_UnsupportedEncryption = 5, + MuOfficeDocErrorType_Aborted = 6, + MuOfficeDocErrorType_OutOfMemory = 7, + + /* FIXME: Additional ones that should be backported to + * smart-office-lib.h */ + MuOfficeDocErrorType_IllegalArgument = 8, + + /** A password is required to open this document. + * + * The app should provide it using MuOfficeDoc_providePassword + * or, if it doesn't want to proceed, call MuOfficeDoc_destroy or + * MuOfficeDoc_abortLoad. + */ + MuOfficeDocErrorType_PasswordRequest = 0x1000 +} MuOfficeDocErrorType; + +/** + * Structure holding the detail of the layout of a bitmap. b5g6r5 is assumed. 
+ */ +typedef struct +{ + void *memptr; + int width; + int height; + int lineSkip; +} MuOfficeBitmap; + +/** + * Structure defining a point + * + * x x coord of point + * y y coord of point + */ +typedef struct +{ + float x; + float y; +} MuOfficePoint; + +/** + * Structure defining a rectangle + * + * x x coord of top left of area within the page + * y y coord of top left of area within the page + * width width of area + * height height of area + */ +typedef struct +{ + float x; + float y; + float width; + float height; +} MuOfficeBox; + +typedef enum MuOfficePointType +{ + MuOfficePointType_MoveTo, + MuOfficePointType_LineTo +} MuOfficePointType; + +typedef struct +{ + float x; + float y; + MuOfficePointType type; +} MuOfficePathPoint; + +/** + * Structure defining what area of a page should be rendered and to what + * area of the bitmap + * + * origin coordinates of the document origin within the bitmap + * renderArea the part of the bitmap to which to render + */ +typedef struct +{ + MuOfficePoint origin; + MuOfficeBox renderArea; +} MuOfficeRenderArea; + +typedef struct MuOfficeLib MuOfficeLib; +typedef struct MuOfficeDoc MuOfficeDoc; +typedef struct MuOfficePage MuOfficePage; +typedef struct MuOfficeRender MuOfficeRender; + +/** + * Allocator function used by some functions to get blocks of memory. + * + * @param cookie data pointer passed in with the allocator. + * @param size the size of the required block. + * + * @returns as for malloc. (NULL implies OutOfMemory, or size == 0). + * Otherwise a pointer to an allocated block. + */ +typedef void *(MuOfficeAllocFn)(void *cookie, + size_t size); + +/** + * Callback function monitoring document loading + * + * Also called when the document is edited, either adding or + * removing pages, with the pagesLoaded value decreasing + * in the page-removal case. + * + * @param cookie the data pointer that was originally passed + * to MuOfficeLib_loadDocument. + * @param pagesLoaded the number of pages so far discovered. 
+ * @param complete whether loading has completed. If this flag + * is set, pagesLoaded is the actual number of + * pages in the document. + */ +typedef void (MuOfficeLoadingProgressFn)(void *cookie, + int pagesLoaded, + int complete); + +/** + * Callback function used to monitor errors in the process of loading + * a document. + * + * @param cookie the data pointer that was originally passed + * to MuOfficeLib_loadDocument. + * @param error the error being reported + */ +typedef void (MuOfficeLoadingErrorFn)( void *cookie, + MuOfficeDocErrorType error); + +/** + * Callback function used to monitor page changes + * + * @param cookie the data pointer that was originally passed + * to MuOfficeDoc_getPage. + * @param area the area that has changed. + */ +typedef void (MuOfficePageUpdateFn)( void *cookie, + const MuOfficeBox *area); + +/** + * Callback function used to monitor a background render of a + * document page. The function is called exactly once. + * + * @param cookie the data pointer that was originally passed + * to MuOfficeDoc_monitorRenderProgress. + * @param error error returned from the rendering process + */ +typedef void (MuOfficeRenderProgressFn)(void *cookie, + MuError error); + +/** + * Document types + * + * Keep in sync with smart-office-lib.h + */ +typedef enum +{ + MuOfficeDocType_PDF, + MuOfficeDocType_XPS, + MuOfficeDocType_IMG +} MuOfficeDocType; + +/** + * The possible results of a save operation + */ +typedef enum MuOfficeSaveResult +{ + MuOfficeSave_Succeeded, + MuOfficeSave_Error, + MuOfficeSave_Cancelled +} +MuOfficeSaveResult; + +/** + * Callback function used to monitor save operations. + * + * @param cookie the data pointer that was originally passed to + * MuOfficeDoc_save. + * @param result the result of the save operation + */ +typedef void (MuOfficeSaveResultFn)( void *cookie, + MuOfficeSaveResult result); + +/** + * Create a MuOfficeLib instance. 
+ * + * @param pMu address of variable to + * receive the created instance + * + * @return error indication - 0 for success + */ +MuError MuOfficeLib_create(MuOfficeLib **pMu); + +/** + * Destroy a MuOfficeLib instance + * + * @param mu the instance to destroy + */ +void MuOfficeLib_destroy(MuOfficeLib *mu); + +/** + * Find the type of a file given its filename extension. + * + * @param path path to the file (in utf8) + * + * @return a valid MuOfficeDocType value, or MuOfficeDocType_Other + */ +MuOfficeDocType MuOfficeLib_getDocTypeFromFileExtension(const char *path); + +/** + * Return a list of filename extensions supported by Mu Office library. + * + * @return comma-delimited list of extensions, without the leading ".". + * The caller should free the returned pointer. + */ +char * MuOfficeLib_getSupportedFileExtensions(void); + +/** + * Load a document + * + * Call will return immediately, leaving the document loading + * in the background + * + * @param mu a MuOfficeLib instance + * @param path path to the file to load (in utf8) + * @param progressFn callback for monitoring progress + * @param errorFn callback for monitoring errors + * @param cookie a pointer to pass back to the callbacks + * @param pDoc address for return of a MuOfficeDoc object + * + * @return error indication - 0 for success + * + * The progress callback may be called several times, with increasing + * values of pagesLoaded. Unless MuOfficeDoc_destroy is called + * before loading completes, a call with "complete" set to true + * is guaranteed. + * + * Once MuOfficeDoc_destroy is called there will be no + * further callbacks. + * + * Alternatively, in a synchronous context, MuOfficeDoc_getNumPages + * can be called to wait for loading to complete and return the total + * number of pages. In this mode of use, progressFn can be NULL. 
+ */ +MuError MuOfficeLib_loadDocument(MuOfficeLib *mu, + const char *path, + MuOfficeLoadingProgressFn *progressFn, + MuOfficeLoadingErrorFn *errorFn, + void *cookie, + MuOfficeDoc **pDoc); + +/** + * Perform MuPDF native operations on a given MuOfficeLib + * instance. + * + * The function is called with a fz_context value that can + * be safely used (i.e. the context is cloned/dropped + * appropriately around the call). The function should signal + * errors by fz_throw-ing. + * + * @param mu the MuOfficeLib instance. + * @param fn the function to call to run the operations. + * @param arg Opaque data pointer. + * + * @return error indication - 0 for success + */ +MuError MuOfficeLib_run(MuOfficeLib *mu, void (*fn)(fz_context *ctx, void *arg), void *arg); + +/** + * Provide the password for a document + * + * This function should be called to provide a password when the document + * error MuOfficeDocErrorType_PasswordRequest is received. + * + * If a password is requested again, this means the password was incorrect. + * + * @param doc the document object + * @param password the password (UTF8 encoded) + * @return error indication - 0 for success + */ +int MuOfficeDoc_providePassword(MuOfficeDoc *doc, const char *password); + +/** + * Return the type of an open document + * + * @param doc the document object + * + * @return the document type + */ +MuOfficeDocType MuOfficeDoc_docType(MuOfficeDoc *doc); + +/** + * Return the number of pages of a document + * + * This function waits for document loading to complete before returning + * the result. It may block the calling thread for a significant period of + * time. To avoid blocking, this call should be avoided in favour of using + * the MuOfficeLib_loadDocument callbacks to monitor loading. + * + * If background loading fails, the associated error will be returned + * from this call. 
+ * + * @param doc the document + * @param pNumPages address for return of the number of pages + * + * @return error indication - 0 for success + */ +MuError MuOfficeDoc_getNumPages(MuOfficeDoc *doc, int *pNumPages); + +/** + * Determine if the document has been modified + * + * @param doc the document + * + * @return modified flag + */ +int MuOfficeDoc_hasBeenModified(MuOfficeDoc *doc); + +/** + * Start a save operation + * + * @param doc the document + * @param path path of the file to which to save + * @param resultFn callback used to report completion + * @param cookie a pointer to pass to the callback + * + * @return error indication - 0 for success + */ +MuError MuOfficeDoc_save(MuOfficeDoc *doc, + const char *path, + MuOfficeSaveResultFn *resultFn, + void *cookie); + +/** + * Stop a document loading. The document is not destroyed, but + * no further content will be read from the file. + * + * @param doc the MuOfficeDoc object + */ +void MuOfficeDoc_abortLoad(MuOfficeDoc *doc); + +/** + * Destroy a MuOfficeDoc object. Loading of the document is shutdown + * and no further callbacks will be issued for the specified object. + * + * @param doc the MuOfficeDoc object + */ +void MuOfficeDoc_destroy(MuOfficeDoc *doc); + +/** + * Get a page of a document + * + * @param doc the document object + * @param pageNumber the number of the page to load (lying in the + * range 0 to one less than the number of pages) + * @param updateFn Function to be called back when the page updates + * @param cookie Opaque value to pass for any updates + * @param pPage Address for return of the page object + * + * @return error indication - 0 for success + */ +MuError MuOfficeDoc_getPage( MuOfficeDoc *doc, + int pageNumber, + MuOfficePageUpdateFn *updateFn, + void *cookie, + MuOfficePage **pPage); + +/** + * Perform MuPDF native operations on a given document. + * + * The function is called with fz_context and fz_document + * values that can be safely used (i.e. 
the context is + * cloned/dropped appropriately around the function, and + * locking is used to ensure that no other threads are + * simultaneously using the document). Functions can + * signal errors by fz_throw-ing. + * + * Due to the locking, it is best to ensure that as little + * time is taken here as possible (i.e. if you fetch some + * data and then spend a long time processing it, it is + * probably best to fetch the data using MuOfficeDoc_run + * and then process it outside). This avoids potentially + * blocking the UI. + * + * @param doc the document object. + * @param fn the function to call with fz_context/fz_document + * values. + * @param arg Opaque data pointer. + * + * @return error indication - 0 for success + */ +MuError MuOfficeDoc_run(MuOfficeDoc *doc, void (*fn)(fz_context *ctx, fz_document *doc, void *arg), void *arg); + +/** + * Destroy a page object + * + * Note this does not delete or remove the page from the document. + * It simply destroys the page object which is merely a reference + * to the page. + * + * @param page the page object + */ +void MuOfficePage_destroy(MuOfficePage *page); + +/** + * Get the size of a page in pixels + * + * This returns the size of the page in pixels. Pages can be rendered + * with a zoom factor. The returned value is the size of bitmap + * appropriate for rendering with a zoom of 1.0 and corresponds to + * 90 dpi. The returned values are not necessarily whole numbers. + * + * @param page the page object + * @param pWidth address for return of the width + * @param pHeight address for return of the height + * + * @return error indication - 0 for success + */ +MuError MuOfficePage_getSize( MuOfficePage *page, + float *pWidth, + float *pHeight); + +/** + * Return the zoom factors necessary to render to a given + * size in pixels. 
(deprecated) + * + * @param page the page object + * @param width the desired width + * @param height the desired height + * @param pXZoom Address for return of zoom necessary to fit width + * @param pYZoom Address for return of zoom necessary to fit height + * + * @return error indication - 0 for success + */ +MuError MuOfficePage_calculateZoom( MuOfficePage *page, + int width, + int height, + float *pXZoom, + float *pYZoom); + +/** + * Get the size of a page in pixels for a specified zoom factor + * (deprecated) + * + * This returns the size of bitmap that should be used to display + * the entire page at the given zoom factor. A zoom of 1.0 + * corresponds to 90 dpi. + * + * @param page the page object + * @param zoom the zoom factor + * @param pWidth address for return of the width + * @param pHeight address for return of the height + * + * @return error indication - 0 for success + */ +MuError MuOfficePage_getSizeForZoom( MuOfficePage *page, + float zoom, + int *pWidth, + int *pHeight); + +/** + * Perform MuPDF native operations on a given page. + * + * The function is called with fz_context and fz_page + * values that can be safely used (i.e. the context is + * cloned/dropped appropriately around the function, and + * locking is used to ensure that no other threads are + * simultaneously using the document). Functions can + * signal errors by fz_throw-ing. + * + * Due to the locking, it is best to ensure that as little + * time is taken here as possible (i.e. if you fetch some + * data and then spend a long time processing it, it is + * probably best to fetch the data using MuOfficePage_run + * and then process it outside). This avoids potentially + * blocking the UI. + * + * @param page the page object. + * @param fn the function to call with fz_context/fz_page + * values. + * @param arg Opaque data pointer. 
+ * + * @return error indication - 0 for success + */ +MuError MuOfficePage_run(MuOfficePage *page, void (*fn)(fz_context *ctx, fz_page *page, void *arg), void *arg); + +/** + * Schedule the rendering of an area of document page to + * an area of a bitmap. + * + * The alignment between page and bitmap is defined by specifying + * document's origin within the bitmap, possibly either positive or + * negative. A render object is returned via which the process can + * be monitored or terminated. + * + * The progress function is called exactly once per render in either + * the success or failure case. + * + * Note that, since a render object represents a running thread that + * needs access to the page, document, and library objects, it is important + * to call MuOfficeRender_destroy, not only before using or deallocating + * the bitmap, but also before calling MuOfficePage_destroy, etc. + * + * @param page the page to render + * @param zoom the zoom factor + * @param bitmap the bitmap + * @param area area to render + * @param progressFn the progress callback function + * @param cookie a pointer to pass to the callback function + * @param pRender Address for return of the render object + * + * @return error indication - 0 for success + */ +MuError MuOfficePage_render( MuOfficePage *page, + float zoom, + const MuOfficeBitmap *bitmap, + const MuOfficeRenderArea *area, + MuOfficeRenderProgressFn *progressFn, + void *cookie, + MuOfficeRender **pRender); + +/** + * Destroy a render + * + * This call destroys a MuOfficeRender object, aborting any current + * render. + * + * This call is intended to support an app dealing with a user quickly + * flicking through document pages. A render may be scheduled but, before + * completion, be found not to be needed. In that case the bitmap will + * need to be reused, which requires any existing render to be aborted. + * The call to MuOfficeRender_destroy will cut short the render and + * allow the bitmap to be reused immediately. 
+ * + * @note If an active render thread is destroyed, it will be aborted. + * While fast, this is not an instant operation. For maximum + * responsiveness, it is best to 'abort' as soon as you realise you + * don't need the render, and to destroy when you get the callback. + * + * @param render The render object + */ +void MuOfficeRender_destroy(MuOfficeRender *render); + +/** + * Abort a render + * + * This call aborts any rendering currently underway. The 'render + * complete' callback (if any) given when the render was created will + * still be called. If a render has completed, this call will have no + * effect. + * + * This call will not block to wait for the render thread to stop, but + * will cause it to stop as soon as it can in the background. + * + * @note It is important not to start any new render to the same bitmap + * until the callback comes in (or until waitUntilComplete returns), as + * otherwise you can have multiple renders drawing to the same bitmap + * with unpredictable consequences. + * + * @param render The render object to abort + */ +void MuOfficeRender_abort(MuOfficeRender *render); + +/** + * Wait for a render to complete. + * + * This call will not return until rendering is complete, so on return + * the bitmap will contain the page image (assuming the render didn't + * run into an error condition) and will not be used further by any + * background processing. Any error during rendering will be returned + * from this function. + * + * This call may block the calling thread for a significant period of + * time. To avoid blocking, supply a progress-monitoring callback + * function to MuOfficePage_render. + * + * @param render The render object to wait for + * @return render error condition - 0 for no error. 
+ */ +MuError MuOfficeRender_waitUntilComplete(MuOfficeRender *render); + +#endif /* MU_OFFICE_LIB_H */ diff --git a/include/mupdf/helpers/mu-threads.h b/include/mupdf/helpers/mu-threads.h new file mode 100644 index 0000000..556b399 --- /dev/null +++ b/include/mupdf/helpers/mu-threads.h @@ -0,0 +1,280 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_HELPERS_MU_THREADS_H +#define MUPDF_HELPERS_MU_THREADS_H + +/* + Simple threading helper library. + Includes implementations for Windows, pthreads, + and "no threads". + + The "no threads" implementation simply provides types + and stub functions so that things will build, but abort + if we try to call them. This simplifies the job for + calling functions. + + To build this library on a platform with no threading, + define DISABLE_MUTHREADS (or extend the ifdeffery below + so that it does so). + + To build this library on a platform that uses a + threading model other than windows threads or pthreads, + extend the #ifdeffery below to set MU_THREAD_IMPL_TYPE + to an unused value, and modify mu-threads.c + appropriately. 
+*/ + +#if !defined(DISABLE_MUTHREADS) +#ifdef _WIN32 +#define MU_THREAD_IMPL_TYPE 1 +#elif defined(HAVE_PTHREAD) +#define MU_THREAD_IMPL_TYPE 2 +#else +#define DISABLE_MUTHREADS +#endif +#endif + +/* + Types +*/ +typedef struct mu_thread mu_thread; +typedef struct mu_semaphore mu_semaphore; +typedef struct mu_mutex mu_mutex; + +/* + Semaphores + + Created with a value of 0. Triggering a semaphore + increments the value. Waiting on a semaphore reduces + the value, blocking if it would become negative. + + Never increment the value of a semaphore above 1, as + this has undefined meaning in this implementation. +*/ + +/* + Create a semaphore. + + sem: Pointer to a mu_semaphore to populate. + + Returns non-zero for error. +*/ +int mu_create_semaphore(mu_semaphore *sem); + +/* + Destroy a semaphore. + Semaphores may safely be destroyed multiple + times. Any semaphore initialised to zeros is + safe to destroy. + + Never destroy a semaphore that may be being waited + upon, as this has undefined meaning in this + implementation. + + sem: Pointer to a mu_semaphore to destroy. +*/ +void mu_destroy_semaphore(mu_semaphore *sem); + +/* + Increment the value of the + semaphore. Never blocks. + + sem: The semaphore to increment. + + Returns non-zero on error. +*/ +int mu_trigger_semaphore(mu_semaphore *sem); + +/* + Decrement the value of the + semaphore, blocking if this would involve making + the value negative. + + sem: The semaphore to decrement. + + Returns non-zero on error. +*/ +int mu_wait_semaphore(mu_semaphore *sem); + +/* + Threads +*/ + +/* + The type for the function that a thread runs. + + arg: User supplied data. +*/ +typedef void (mu_thread_fn)(void *arg); + +/* + Create a thread to run the + supplied function with the supplied argument. + + th: Pointer to mu_thread to populate with created + thread's information. + + fn: The function for the thread to run. + + arg: The argument to pass to fn. 
+*/ +int mu_create_thread(mu_thread *th, mu_thread_fn *fn, void *arg); + +/* + Destroy a thread. This function + blocks until the thread has terminated normally, and + destroys its storage. A mu_thread may safely be destroyed + multiple times, as may any mu_thread initialised with + zeros. + + th: Pointer to mu_thread to destroy. +*/ +void mu_destroy_thread(mu_thread *th); + +/* + Mutexes + + This implementation does not specify whether + mutexes are recursive or not. +*/ + +/* + Create a mutex. + + mutex: Pointer to a mu_mutex to populate. + + Returns non-zero on error. +*/ +int mu_create_mutex(mu_mutex *mutex); + +/* + Destroy a mutex. A mu_mutex may + safely be destroyed several times, as may a mu_mutex + initialised with zeros. Never destroy a locked mu_mutex. + + mutex: Pointer to mu_mutex to destroy. +*/ +void mu_destroy_mutex(mu_mutex *mutex); + +/* + Lock a mutex. + + mutex: Mutex to lock. +*/ +void mu_lock_mutex(mu_mutex *mutex); + +/* + Unlock a mutex. + + mutex: Mutex to unlock. +*/ +void mu_unlock_mutex(mu_mutex *mutex); + +/* + Everything under this point is implementation specific. + Only people looking to extend the capabilities of this + helper module should need to look below here. +*/ + +#ifdef DISABLE_MUTHREADS + +/* Null implementation */ +struct mu_semaphore +{ + int dummy; +}; + +struct mu_thread +{ + int dummy; +}; + +struct mu_mutex +{ + int dummy; +}; + +#elif MU_THREAD_IMPL_TYPE == 1 + +#include <windows.h> + +/* Windows threads */ +struct mu_semaphore +{ + HANDLE handle; +}; + +struct mu_thread +{ + HANDLE handle; + mu_thread_fn *fn; + void *arg; +}; + +struct mu_mutex +{ + CRITICAL_SECTION mutex; +}; + +#elif MU_THREAD_IMPL_TYPE == 2 + +/* + PThreads - without working unnamed semaphores. + + Neither iOS nor OS X supports unnamed semaphores. + Named semaphores are a pain to use, so we implement + our own semaphores using condition variables and + mutexes.
+*/ + +#include <pthread.h> + +struct mu_semaphore +{ + int count; + pthread_mutex_t mutex; + pthread_cond_t cond; +}; + +struct mu_thread +{ + pthread_t thread; + mu_thread_fn *fn; + void *arg; +}; + +struct mu_mutex +{ + pthread_mutex_t mutex; +}; + +/* + Add new threading implementations here, with + #elif MU_THREAD_IMPL_TYPE == 3... etc. +*/ + +#else +#error Unknown MU_THREAD_IMPL_TYPE setting +#endif + +#endif /* MUPDF_HELPERS_MU_THREADS_H */ diff --git a/include/mupdf/helpers/pkcs7-openssl.h b/include/mupdf/helpers/pkcs7-openssl.h new file mode 100644 index 0000000..50c4689 --- /dev/null +++ b/include/mupdf/helpers/pkcs7-openssl.h @@ -0,0 +1,48 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_PKCS7_OPENSSL_H +#define MUPDF_PKCS7_OPENSSL_H + +#include "mupdf/pdf/document.h" +#include "mupdf/pdf/form.h" + +/* This is an example pkcs7 implementation using openssl. These are the types of functions that you + * will likely need to sign documents and check signatures within documents.
In particular, to + * sign a document, you need a function that derives a pdf_pkcs7_signer object from a certificate + * stored by the operating system or within a file. */ + +/* Check a signature's digest against ranges of bytes drawn from a stream */ +pdf_signature_error pkcs7_openssl_check_digest(fz_context *ctx, fz_stream *stm, char *sig, size_t sig_len); + +/* Check a signature's certificate is trusted */ +pdf_signature_error pkcs7_openssl_check_certificate(char *sig, size_t sig_len); + +/* Obtain the distinguished name information from signature's certificate */ +pdf_pkcs7_distinguished_name *pkcs7_openssl_distinguished_name(fz_context *ctx, char *sig, size_t sig_len); + +/* Read the certificate and private key from a pfx file, holding it as an opaque structure */ +pdf_pkcs7_signer *pkcs7_openssl_read_pfx(fz_context *ctx, const char *pfile, const char *pw); + +pdf_pkcs7_verifier *pkcs7_openssl_new_verifier(fz_context *ctx); + +#endif diff --git a/include/mupdf/memento.h b/include/mupdf/memento.h new file mode 100644 index 0000000..b2d01b9 --- /dev/null +++ b/include/mupdf/memento.h @@ -0,0 +1,423 @@ +/* Copyright (C) 2009-2022 Artifex Software, Inc. + All Rights Reserved. + + This software is provided AS-IS with no warranty, either express or + implied. + + This software is distributed under license and may not be copied, + modified or distributed except as expressly authorized under the terms + of the license contained in the file COPYING in this distribution. + + Refer to licensing information at http://www.artifex.com or contact + Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, + CA 94129, USA, for further information. +*/ + +/* Memento: A library to aid debugging of memory leaks/heap corruption. + * + * Usage (with C): + * First, build your project with MEMENTO defined, and include this + * header file wherever you use malloc, realloc or free. 
+ * This header file will use macros to redirect malloc, realloc and free to + * Memento_malloc, Memento_realloc and Memento_free. + * + * Run your program, and all mallocs/frees/reallocs should be redirected + * through here. When the program exits, you will get a list of all the + * leaked blocks, together with some helpful statistics. You can get the + * same list of allocated blocks at any point during program execution by + * calling Memento_listBlocks(); + * + * Every call to malloc/free/realloc counts as an 'allocation event'. + * On each event Memento increments a counter. Every block is tagged with + * the current counter on allocation. Every so often during program + * execution, the heap is checked for consistency. By default this happens + * after 1024 events, then after 2048 events, then after 4096 events, etc. + * This can be changed at runtime by using Memento_setParanoia(int level). + * 0 turns off such checking, 1 sets checking to happen on every event, + * any positive number n sets checking to happen once every n events, + * and any negative number n sets checking to happen after -n events, then + * after -2n events etc. + * + * The default paranoia level is therefore -1024. + * + * Memento keeps blocks around for a while after they have been freed, and + * checks them as part of these heap checks to see if they have been + * written to (or are freed twice etc). + * + * A given heap block can be checked for consistency (its 'pre' and + * 'post' guard blocks are checked to see if they have been written to) + * by calling Memento_checkBlock(void *blockAddress); + * + * A check of all the memory can be triggered by calling Memento_check(); + * (or Memento_checkAllMemory(); if you'd like it to be quieter). + * + * A good place to breakpoint is Memento_breakpoint, as this will then + * trigger your debugger if an error is detected. This is done + * automatically for debug Windows builds.
+ * + * If a block is found to be corrupt, information will be printed to the + * console, including the address of the block, the size of the block, + * the type of corruption, the number of the block and the event on which + * it last passed a check for correctness. + * + * If you rerun, and call Memento_paranoidAt(int event); with this number + * the code will wait until it reaches that event and then start + * checking the heap after every allocation event. Assuming it is a + * deterministic failure, you should then find out where in your program + * the error is occurring (between event x-1 and event x). + * + * Then you can rerun the program again, and call + * Memento_breakAt(int event); and the program will call + * Memento_breakpoint() when event x is reached, enabling you to step + * through. + * + * Memento_find(address) will tell you what block (if any) the given + * address is in. + * + * An example: + * Suppose we have a gs invocation that crashes with memory corruption. + * * Build with -DMEMENTO. + * * In your debugger put a breakpoint on Memento_breakpoint. + * * Run the program. It will stop in Memento_inited. + * * Execute Memento_setParanoia(1); (In VS use Ctrl-Alt-Q). (Note #1) + * * Continue execution. + * * It will detect the memory corruption on the next allocation event + * after it happens, and stop in Memento_breakpoint. The console should + * show something like: + * + * Freed blocks: + * 0x172e610(size=288,num=1415) index 256 (0x172e710) onwards corrupted + * Block last checked OK at allocation 1457. Now 1458. + * + * * This means that the block became corrupted between allocation 1457 + * and 1458 - so if we rerun and stop the program at 1457, we can then + * step through, possibly with a data breakpoint at 0x172e710 and see + * when it occurs. + * * So restart the program from the beginning.
When we stop after + * initialisation execute Memento_breakAt(1457); (and maybe + * Memento_setParanoia(1), or Memento_paranoidAt(1457)) + * * Continue execution until we hit Memento_breakpoint. + * * Now you can step through and watch the memory corruption happen. + * + * Note #1: Using Memento_setParanoia(1) can cause your program to run + * very slowly. You may instead choose to use Memento_setParanoia(100) + * (or some other figure). This will only exhaustively check memory on + * every 100th allocation event. This trades speed for the size of the + * average allocation event range in which detection of memory corruption + * occurs. You may (for example) choose to run once checking every 100 + * allocations and discover that the corruption happens between events + * X and X+100. You can then rerun using Memento_paranoidAt(X), and + * it'll only start exhaustively checking when it reaches X. + * + * More than one memory allocator? + * + * If you have more than one memory allocator in the system (like for + * instance the ghostscript chunk allocator, that builds on top of the + * standard malloc and returns chunks itself), then there are some things + * to note: + * + * * If the secondary allocator gets its underlying blocks from calling + * malloc, then those will be checked by Memento, but 'subblocks' that + * are returned to the secondary allocator will not. There is currently + * no way to fix this other than trying to bypass the secondary + * allocator. One way I have found to do this with the chunk allocator + * is to tweak its idea of a 'large block' so that it puts every + * allocation in its own chunk. Clearly this negates the point of having + * a secondary allocator, and is therefore not recommended for general + * use. + * + * * Again, if the secondary allocator gets its underlying blocks from + * calling malloc (and hence Memento) leak detection should still work + * (but whole blocks will be detected rather than subblocks).
+ * + * * If on every allocation attempt the secondary allocator calls into + * Memento_failThisEvent(), and fails the allocation if it returns true + * then more useful features can be used; firstly memory squeezing will + * work, and secondly, Memento will have a "finer grained" paranoia + * available to it. + * + * Usage with C++: + * + * Memento has some experimental code in it to trap new/delete (and + * new[]/delete[] if required) calls. + * + * In all cases, Memento will provide a C API that new/delete + * operators can be built upon: + * void *Memento_cpp_new(size_t size); + * void Memento_cpp_delete(void *pointer); + * void *Memento_cpp_new_array(size_t size); + * void Memento_cpp_delete_array(void *pointer); + * + * There are various ways that actual operator definitions can be + * provided: + * + * 1) If memento.c is built with the c++ compiler, then global new + * and delete operators will be built in to memento by default. + * + * 2) If memento.c is built as normal with the C compiler, then + * no such veneers will be built in. The caller must provide them + * themselves. This can be done either by: + * + * a) Copying the lines between: + * // C++ Operator Veneers - START + * and + * // C++ Operator Veneers - END + * from memento.c into a C++ file within their own project. + * + * or + * + * b) Add the following lines to a C++ file in the project: + * #define MEMENTO_CPP_EXTRAS_ONLY + * #include "memento.c" + * + * 3) For those people that would like to be able to compile memento.c + * with a C compiler, and provide new/delete veneers globally + * within their own C++ code (so avoiding the need for memento.h to + * be included from every file), define MEMENTO_NO_CPLUSPLUS as you + * build, and Memento will not provide any veneers itself, instead + * relying on the library user to provide them. 
+ * + * For convenience the lines to implement such veneers can be found + * at the end of memento.c between: + * // C++ Operator Veneers - START + * and + * // C++ Operator Veneers - END + * + * Memento's interception of new/delete can be disabled at runtime + * by using Memento_setIgnoreNewDelete(1). Alternatively the + * MEMENTO_IGNORENEWDELETE environment variable can be set to 1 to + * achieve the same result. + * + * Both Windows and GCC provide separate new[] and delete[] operators + * for arrays. Apparently some systems do not. If this is the case for + * your system, define MEMENTO_CPP_NO_ARRAY_CONSTRUCTORS. + * + * "libbacktrace.so failed to load" + * + * In order to give nice backtraces on unix, Memento will try to use + * a libbacktrace dynamic library. If it can't find it, you'll see + * that warning, and your backtraces won't include file/line information. + * + * To fix this you'll need to build your own libbacktrace. Don't worry + * it's really easy: + * git clone git://github.com/ianlancetaylor/libbacktrace + * cd libbacktrace + * ./configure --enable-shared + * make + * + * This leaves the build .so as .libs/libbacktrace.so + * + * Memento will look for this on LD_LIBRARY_PATH, or in /opt/lib/, + * or in /lib/, or in /usr/lib/, or in /usr/local/lib/. I recommend + * using /opt/lib/ as this won't conflict with anything that you + * get via a package manager like apt. + * + * sudo mkdir /opt + * sudo mkdir /opt/lib + * sudo cp .libs/libbacktrace.so /opt/lib/ + */ + +#ifdef __cplusplus + +// Avoids problems with strdup()'s throw() attribute on Linux. +#include + +extern "C" { +#endif + +#ifndef MEMENTO_H + +/* Include all these first, so our definitions below do + * not conflict with them. 
*/ +#include +#include +#include + +#define MEMENTO_H + +#ifndef MEMENTO_UNDERLYING_MALLOC +#define MEMENTO_UNDERLYING_MALLOC malloc +#endif +#ifndef MEMENTO_UNDERLYING_FREE +#define MEMENTO_UNDERLYING_FREE free +#endif +#ifndef MEMENTO_UNDERLYING_REALLOC +#define MEMENTO_UNDERLYING_REALLOC realloc +#endif +#ifndef MEMENTO_UNDERLYING_CALLOC +#define MEMENTO_UNDERLYING_CALLOC calloc +#endif + +#ifndef MEMENTO_MAXALIGN +#define MEMENTO_MAXALIGN (sizeof(int)) +#endif + +#define MEMENTO_PREFILL 0xa6 +#define MEMENTO_POSTFILL 0xa7 +#define MEMENTO_ALLOCFILL 0xa8 +#define MEMENTO_FREEFILL 0xa9 + +int Memento_checkBlock(void *); +int Memento_checkAllMemory(void); +int Memento_check(void); + +int Memento_setParanoia(int); +int Memento_paranoidAt(int); +int Memento_breakAt(int); +void Memento_breakOnFree(void *a); +void Memento_breakOnRealloc(void *a); +int Memento_getBlockNum(void *); +int Memento_find(void *a); +void Memento_breakpoint(void); +int Memento_failAt(int); +int Memento_failThisEvent(void); +void Memento_listBlocks(void); +void Memento_listNewBlocks(void); +void Memento_listPhasedBlocks(void); +size_t Memento_setMax(size_t); +void Memento_stats(void); +void *Memento_label(void *, const char *); +void Memento_tick(void); +int Memento_setVerbose(int); + +/* Terminate backtraces if we see specified function name. E.g. +'cfunction_call' will exclude Python interpreter functions when Python calls C +code. Returns 0 on success, -1 on failure (out of memory). */ +int Memento_addBacktraceLimitFnname(const char *fnname); + +/* If atexitfin is 0, we do not call Memento_fin() in an atexit() handler.
*/ +int Memento_setAtexitFin(int atexitfin); + +int Memento_setIgnoreNewDelete(int ignore); + +void *Memento_malloc(size_t s); +void *Memento_realloc(void *, size_t s); +void Memento_free(void *); +void *Memento_calloc(size_t, size_t); +char *Memento_strdup(const char*); +#if !defined(MEMENTO_GS_HACKS) && !defined(MEMENTO_MUPDF_HACKS) +int Memento_asprintf(char **ret, const char *format, ...); +int Memento_vasprintf(char **ret, const char *format, va_list ap); +#endif + +void Memento_info(void *addr); +void Memento_listBlockInfo(void); +void Memento_blockInfo(void *blk); +void *Memento_takeByteRef(void *blk); +void *Memento_dropByteRef(void *blk); +void *Memento_takeShortRef(void *blk); +void *Memento_dropShortRef(void *blk); +void *Memento_takeIntRef(void *blk); +void *Memento_dropIntRef(void *blk); +void *Memento_takeRef(void *blk); +void *Memento_dropRef(void *blk); +void *Memento_adjustRef(void *blk, int adjust); +void *Memento_reference(void *blk); + +int Memento_checkPointerOrNull(void *blk); +int Memento_checkBytePointerOrNull(void *blk); +int Memento_checkShortPointerOrNull(void *blk); +int Memento_checkIntPointerOrNull(void *blk); + +void Memento_startLeaking(void); +void Memento_stopLeaking(void); + +/* Returns number of allocation events so far. */ +int Memento_sequence(void); + +/* Returns non-zero if our process was forked by Memento squeeze. 
*/ +int Memento_squeezing(void); + +void Memento_fin(void); + +void Memento_bt(void); + +void *Memento_cpp_new(size_t size); +void Memento_cpp_delete(void *pointer); +void *Memento_cpp_new_array(size_t size); +void Memento_cpp_delete_array(void *pointer); + +void Memento_showHash(unsigned int hash); + +#ifdef MEMENTO + +#ifndef COMPILING_MEMENTO_C +#define malloc Memento_malloc +#define free Memento_free +#define realloc Memento_realloc +#define calloc Memento_calloc +#define strdup Memento_strdup +#if !defined(MEMENTO_GS_HACKS) && !defined(MEMENTO_MUPDF_HACKS) +#define asprintf Memento_asprintf +#define vasprintf Memento_vasprintf +#endif +#endif + +#else + +#define Memento_malloc MEMENTO_UNDERLYING_MALLOC +#define Memento_free MEMENTO_UNDERLYING_FREE +#define Memento_realloc MEMENTO_UNDERLYING_REALLOC +#define Memento_calloc MEMENTO_UNDERLYING_CALLOC +#define Memento_strdup strdup +#if !defined(MEMENTO_GS_HACKS) && !defined(MEMENTO_MUPDF_HACKS) +#define Memento_asprintf asprintf +#define Memento_vasprintf vasprintf +#endif + +#define Memento_checkBlock(A) 0 +#define Memento_checkAllMemory() 0 +#define Memento_check() 0 +#define Memento_setParanoia(A) 0 +#define Memento_paranoidAt(A) 0 +#define Memento_breakAt(A) 0 +#define Memento_breakOnFree(A) 0 +#define Memento_breakOnRealloc(A) 0 +#define Memento_getBlockNum(A) 0 +#define Memento_find(A) 0 +#define Memento_breakpoint() do {} while (0) +#define Memento_failAt(A) 0 +#define Memento_failThisEvent() 0 +#define Memento_listBlocks() do {} while (0) +#define Memento_listNewBlocks() do {} while (0) +#define Memento_listPhasedBlocks() do {} while (0) +#define Memento_setMax(A) 0 +#define Memento_stats() do {} while (0) +#define Memento_label(A,B) (A) +#define Memento_info(A) do {} while (0) +#define Memento_listBlockInfo() do {} while (0) +#define Memento_blockInfo(A) do {} while (0) +#define Memento_takeByteRef(A) (A) +#define Memento_dropByteRef(A) (A) +#define Memento_takeShortRef(A) (A) +#define 
Memento_dropShortRef(A) (A) +#define Memento_takeIntRef(A) (A) +#define Memento_dropIntRef(A) (A) +#define Memento_takeRef(A) (A) +#define Memento_dropRef(A) (A) +#define Memento_adjustRef(A,V) (A) +#define Memento_reference(A) (A) +#define Memento_checkPointerOrNull(A) 0 +#define Memento_checkBytePointerOrNull(A) 0 +#define Memento_checkShortPointerOrNull(A) 0 +#define Memento_checkIntPointerOrNull(A) 0 +#define Memento_setIgnoreNewDelete(v) 0 + +#define Memento_tick() do {} while (0) +#define Memento_startLeaking() do {} while (0) +#define Memento_stopLeaking() do {} while (0) +#define Memento_fin() do {} while (0) +#define Memento_bt() do {} while (0) +#define Memento_sequence() (0) +#define Memento_squeezing() (0) +#define Memento_setVerbose(A) (A) +#define Memento_addBacktraceLimitFnname(A) (0) +#define Memento_setAtexitFin(atexitfin) (0) + +#endif /* MEMENTO */ + +#ifdef __cplusplus +} +#endif + +#endif /* MEMENTO_H */ diff --git a/include/mupdf/pdf.h b/include/mupdf/pdf.h new file mode 100644 index 0000000..52567d2 --- /dev/null +++ b/include/mupdf/pdf.h @@ -0,0 +1,55 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. 
+// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_PDF_H +#define MUPDF_PDF_H + +#include "mupdf/fitz.h" + +#ifdef __cplusplus +extern "C" { +#endif + +#include "mupdf/pdf/object.h" +#include "mupdf/pdf/document.h" +#include "mupdf/pdf/parse.h" +#include "mupdf/pdf/xref.h" +#include "mupdf/pdf/crypt.h" + +#include "mupdf/pdf/page.h" +#include "mupdf/pdf/resource.h" +#include "mupdf/pdf/cmap.h" +#include "mupdf/pdf/font.h" +#include "mupdf/pdf/interpret.h" + +#include "mupdf/pdf/annot.h" +#include "mupdf/pdf/form.h" +#include "mupdf/pdf/event.h" +#include "mupdf/pdf/javascript.h" + +#include "mupdf/pdf/clean.h" + +#ifdef __cplusplus +} +#endif + +#endif diff --git a/include/mupdf/pdf/annot.h b/include/mupdf/pdf/annot.h new file mode 100644 index 0000000..dbc7909 --- /dev/null +++ b/include/mupdf/pdf/annot.h @@ -0,0 +1,921 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_PDF_ANNOT_H +#define MUPDF_PDF_ANNOT_H + +#include "mupdf/fitz/display-list.h" +#include "mupdf/fitz/stream.h" +#include "mupdf/fitz/structured-text.h" +#include "mupdf/pdf/object.h" +#include "mupdf/pdf/page.h" + +typedef struct pdf_annot pdf_annot; + +enum pdf_annot_type +{ + PDF_ANNOT_TEXT, + PDF_ANNOT_LINK, + PDF_ANNOT_FREE_TEXT, + PDF_ANNOT_LINE, + PDF_ANNOT_SQUARE, + PDF_ANNOT_CIRCLE, + PDF_ANNOT_POLYGON, + PDF_ANNOT_POLY_LINE, + PDF_ANNOT_HIGHLIGHT, + PDF_ANNOT_UNDERLINE, + PDF_ANNOT_SQUIGGLY, + PDF_ANNOT_STRIKE_OUT, + PDF_ANNOT_REDACT, + PDF_ANNOT_STAMP, + PDF_ANNOT_CARET, + PDF_ANNOT_INK, + PDF_ANNOT_POPUP, + PDF_ANNOT_FILE_ATTACHMENT, + PDF_ANNOT_SOUND, + PDF_ANNOT_MOVIE, + PDF_ANNOT_RICH_MEDIA, + PDF_ANNOT_WIDGET, + PDF_ANNOT_SCREEN, + PDF_ANNOT_PRINTER_MARK, + PDF_ANNOT_TRAP_NET, + PDF_ANNOT_WATERMARK, + PDF_ANNOT_3D, + PDF_ANNOT_PROJECTION, + PDF_ANNOT_UNKNOWN = -1 +}; + +/* + Map an annotation type to a (static) string. + + The returned string must not be freed by the caller. +*/ +const char *pdf_string_from_annot_type(fz_context *ctx, enum pdf_annot_type type); + +/* + Map from a (non-NULL, case sensitive) string to an annotation + type. 
+*/ +enum pdf_annot_type pdf_annot_type_from_string(fz_context *ctx, const char *subtype); + +enum +{ + PDF_ANNOT_IS_INVISIBLE = 1 << (1-1), + PDF_ANNOT_IS_HIDDEN = 1 << (2-1), + PDF_ANNOT_IS_PRINT = 1 << (3-1), + PDF_ANNOT_IS_NO_ZOOM = 1 << (4-1), + PDF_ANNOT_IS_NO_ROTATE = 1 << (5-1), + PDF_ANNOT_IS_NO_VIEW = 1 << (6-1), + PDF_ANNOT_IS_READ_ONLY = 1 << (7-1), + PDF_ANNOT_IS_LOCKED = 1 << (8-1), + PDF_ANNOT_IS_TOGGLE_NO_VIEW = 1 << (9-1), + PDF_ANNOT_IS_LOCKED_CONTENTS = 1 << (10-1) +}; + +enum pdf_line_ending +{ + PDF_ANNOT_LE_NONE = 0, + PDF_ANNOT_LE_SQUARE, + PDF_ANNOT_LE_CIRCLE, + PDF_ANNOT_LE_DIAMOND, + PDF_ANNOT_LE_OPEN_ARROW, + PDF_ANNOT_LE_CLOSED_ARROW, + PDF_ANNOT_LE_BUTT, + PDF_ANNOT_LE_R_OPEN_ARROW, + PDF_ANNOT_LE_R_CLOSED_ARROW, + PDF_ANNOT_LE_SLASH +}; + +enum +{ + PDF_ANNOT_Q_LEFT = 0, + PDF_ANNOT_Q_CENTER = 1, + PDF_ANNOT_Q_RIGHT = 2 +}; + +/* + Map from a PDF name specifying an annotation line ending + to an enumerated line ending value. +*/ +enum pdf_line_ending pdf_line_ending_from_name(fz_context *ctx, pdf_obj *end); + +/* + Map from a (non-NULL, case sensitive) C string specifying + an annotation line ending to an enumerated line ending value. +*/ +enum pdf_line_ending pdf_line_ending_from_string(fz_context *ctx, const char *end); + +/* + Map from an enumerated line ending to a pdf name object that + specifies it. +*/ +pdf_obj *pdf_name_from_line_ending(fz_context *ctx, enum pdf_line_ending end); + +/* + Map from an enumerated line ending to a C string that specifies + it. + + The caller must not free the returned string. +*/ +const char *pdf_string_from_line_ending(fz_context *ctx, enum pdf_line_ending end); + +/* + Increment the reference count for an annotation. + + Never throws exceptions. Returns the same pointer. +*/ +pdf_annot *pdf_keep_annot(fz_context *ctx, pdf_annot *annot); + +/* + Drop the reference count for an annotation. + + When the reference count reaches zero, the annotation will + be destroyed. Never throws exceptions. 
+*/ +void pdf_drop_annot(fz_context *ctx, pdf_annot *annot); + +/* + Returns a borrowed reference to the first annotation on + a page, or NULL if none. + + The caller should fz_keep this if it intends to hold the + pointer. Unless it fz_keeps it, it must not fz_drop it. +*/ +pdf_annot *pdf_first_annot(fz_context *ctx, pdf_page *page); + +/* + Returns a borrowed reference to the next annotation + on a page, or NULL if none. + + The caller should fz_keep this if it intends to hold the + pointer. Unless it fz_keeps it, it must not fz_drop it. +*/ +pdf_annot *pdf_next_annot(fz_context *ctx, pdf_annot *annot); + +/* + Returns a borrowed reference to the object underlying + an annotation. + + The caller should fz_keep this if it intends to hold the + pointer. Unless it fz_keeps it, it must not fz_drop it. +*/ +pdf_obj *pdf_annot_obj(fz_context *ctx, pdf_annot *annot); + +/* + Returns a borrowed reference to the page to which + an annotation belongs. + + The caller should fz_keep this if it intends to hold the + pointer. Unless it fz_keeps it, it must not fz_drop it. +*/ +pdf_page *pdf_annot_page(fz_context *ctx, pdf_annot *annot); + +/* + Return the rectangle for an annotation on a page. +*/ +fz_rect pdf_bound_annot(fz_context *ctx, pdf_annot *annot); + +enum pdf_annot_type pdf_annot_type(fz_context *ctx, pdf_annot *annot); + +/* + Interpret an annotation and render it on a device. + + page: A page loaded by pdf_load_page. + + annot: an annotation. + + dev: Device used for rendering, obtained from fz_new_*_device. + + ctm: A transformation matrix applied to the objects on the page, + e.g. to scale or rotate the page contents as desired. +*/ +void pdf_run_annot(fz_context *ctx, pdf_annot *annot, fz_device *dev, fz_matrix ctm, fz_cookie *cookie); + +/* + Lookup needle in the nametree of the document given by which. + + The returned reference is borrowed, and should not be dropped, + unless it is kept first. 
+*/ +pdf_obj *pdf_lookup_name(fz_context *ctx, pdf_document *doc, pdf_obj *which, pdf_obj *needle); + +/* + Load a nametree, flattening it into a single dictionary. + + The caller is responsible for pdf_dropping the returned + reference. +*/ +pdf_obj *pdf_load_name_tree(fz_context *ctx, pdf_document *doc, pdf_obj *which); + +/* + Lookup needle in the given number tree. + + The returned reference is borrowed, and should not be dropped, + unless it is kept first. +*/ +pdf_obj *pdf_lookup_number(fz_context *ctx, pdf_obj *root, int needle); + +/* + Perform a depth first traversal of a tree. + + Start at tree, looking for children in the array named + kid_name at each level. + + The arrive callback is called when we arrive at a node (i.e. + before all the children are walked), and then the leave callback + is called as we leave it (after all the children have been + walked). + + names and values are (matching) null terminated arrays of + names and values to be carried down the tree, to implement + inheritance. NULL is a permissible value. +*/ +void pdf_walk_tree(fz_context *ctx, pdf_obj *tree, pdf_obj *kid_name, + void (*arrive)(fz_context *, pdf_obj *, void *, pdf_obj **), + void (*leave)(fz_context *, pdf_obj *, void *), + void *arg, + pdf_obj **names, + pdf_obj **values); + +/* + Resolve a link within a document. +*/ +int pdf_resolve_link(fz_context *ctx, pdf_document *doc, const char *uri, float *xp, float *yp); +fz_link_dest pdf_resolve_link_dest(fz_context *ctx, pdf_document *doc, const char *uri); + +/* + Create an action object given a link URI. The action will + be a GoTo or URI action depending on whether the link URI + specifies a document internal or external destination. +*/ +pdf_obj *pdf_new_action_from_link(fz_context *ctx, pdf_document *doc, const char *uri); + +/* + Create a destination object given a link URI expected to adhere + to the Adobe specification "Parameters for Opening PDF files" + from the Adobe Acrobat SDK. 
The resulting destination object + will either be a PDF string, or a PDF array referring to a page + and suitable zoom level settings. In the latter case the page + can be referred to by PDF object number or by page number, this + is controlled by the is_remote argument. For remote destinations + it is not possible to refer to the page by object number, so + page numbers are used instead. +*/ +pdf_obj *pdf_new_dest_from_link(fz_context *ctx, pdf_document *doc, const char *uri, int is_remote); + +/* + Create a link URI string according to the Adobe specification + "Parameters for Opening PDF files" from the Adobe Acrobat SDK, + version 8.1, which can, at the time of writing, be found here: + + https://web.archive.org/web/20170921000830/http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_open_parameters.pdf + + The resulting string must be freed by the caller. +*/ +char *pdf_new_uri_from_explicit_dest(fz_context *ctx, fz_link_dest dest); + +/* + Create a remote link URI string according to the Adobe specification + "Parameters for Opening PDF files" from the Adobe Acrobat SDK, + version 8.1, which can, at the time of writing, be found here: + + https://web.archive.org/web/20170921000830/http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_open_parameters.pdf + + The file: URI scheme is used in the resulting URI if the remote document + is specified by a system independent path (already taking the recommendations + in table 3.40 of the PDF 1.7 specification into account), and either a + destination name or a page number and zoom level are appended: + file:///path/doc.pdf#page=42&view=FitV,100 + file:///path/doc.pdf#nameddest=G42.123456 + + If a URL is used to specify the remote document, then its scheme takes + precedence and either a destination name or a page number and zoom level + are appended: + ftp://example.com/alpha.pdf#page=42&view=Fit + https://example.com/bravo.pdf?query=parameter#page=42&view=Fit + + The resulting string must be 
freed by the caller. +*/ +char *pdf_append_named_dest_to_uri(fz_context *ctx, const char *url, const char *name); +char *pdf_append_explicit_dest_to_uri(fz_context *ctx, const char *url, fz_link_dest dest); +char *pdf_new_uri_from_path_and_named_dest(fz_context *ctx, const char *path, const char *name); +char *pdf_new_uri_from_path_and_explicit_dest(fz_context *ctx, const char *path, fz_link_dest dest); + +/* + Create transform to fit appearance stream to annotation Rect +*/ +fz_matrix pdf_annot_transform(fz_context *ctx, pdf_annot *annot); + +/* + Create a new link object. +*/ +fz_link *pdf_new_link(fz_context *ctx, pdf_page *page, fz_rect rect, const char *uri, pdf_obj *obj); + +/* + create a new annotation of the specified type on the + specified page. The returned pdf_annot structure is owned by the + page and does not need to be freed. +*/ +pdf_annot *pdf_create_annot_raw(fz_context *ctx, pdf_page *page, enum pdf_annot_type type); + +/* + create a new link on the specified page. The returned fz_link + structure is owned by the page and does not need to be freed. +*/ +fz_link *pdf_create_link(fz_context *ctx, pdf_page *page, fz_rect bbox, const char *uri); + +/* + delete an existing link from the specified page. +*/ +void pdf_delete_link(fz_context *ctx, pdf_page *page, fz_link *link); + +enum pdf_border_style +{ + PDF_BORDER_STYLE_SOLID = 0, + PDF_BORDER_STYLE_DASHED, + PDF_BORDER_STYLE_BEVELED, + PDF_BORDER_STYLE_INSET, + PDF_BORDER_STYLE_UNDERLINE, +}; + +enum pdf_border_effect +{ + PDF_BORDER_EFFECT_NONE = 0, + PDF_BORDER_EFFECT_CLOUDY, +}; + +/* + create a new annotation of the specified type on the + specified page. Populate it with sensible defaults per the type. + + Currently this returns a reference that the caller owns, and + must drop when finished with it. Up until release 1.18, the + returned reference was owned by the page and did not need to + be freed. 
+*/ +pdf_annot *pdf_create_annot(fz_context *ctx, pdf_page *page, enum pdf_annot_type type); + +/* + Delete an annotation from the page. + + This unlinks the annotation from the page structure and drops + the page's reference to it. Any reference held by the caller + will not be dropped automatically, so this can safely be used + on a borrowed reference. +*/ +void pdf_delete_annot(fz_context *ctx, pdf_page *page, pdf_annot *annot); + +/* + Edit the associated Popup annotation rectangle. + + Popup annotations are used to store the size and position of the + popup box that is used to edit the contents of the markup annotation. +*/ +void pdf_set_annot_popup(fz_context *ctx, pdf_annot *annot, fz_rect rect); +fz_rect pdf_annot_popup(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has a rect. +*/ +int pdf_annot_has_rect(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has an ink list. +*/ +int pdf_annot_has_ink_list(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has quad points data. +*/ +int pdf_annot_has_quad_points(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has vertex data. +*/ +int pdf_annot_has_vertices(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has line data. +*/ +int pdf_annot_has_line(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has an interior color. +*/ +int pdf_annot_has_interior_color(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has line ending styles. +*/ +int pdf_annot_has_line_ending_styles(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has a border. +*/ +int pdf_annot_has_border(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has a border effect. +*/ +int pdf_annot_has_border_effect(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has an icon name. 
+*/ +int pdf_annot_has_icon_name(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has an open action. +*/ +int pdf_annot_has_open(fz_context *ctx, pdf_annot *annot); + +/* + Check to see if an annotation has author data. +*/ +int pdf_annot_has_author(fz_context *ctx, pdf_annot *annot); + +/* + Retrieve the annotation flags. +*/ +int pdf_annot_flags(fz_context *ctx, pdf_annot *annot); + +/* + Retrieve the annotation bounds in doc space. +*/ +fz_rect pdf_annot_rect(fz_context *ctx, pdf_annot *annot); + +/* + Retrieve the annotation border line width in points. +*/ +float pdf_annot_border(fz_context *ctx, pdf_annot *annot); + +/* + Retrieve the annotation border style. + */ +enum pdf_border_style pdf_annot_border_style(fz_context *ctx, pdf_annot *annot); + +/* + Retrieve the annotation border width in points. + */ +float pdf_annot_border_width(fz_context *ctx, pdf_annot *annot); + +/* + How many items does the annotation border dash pattern have? + */ +int pdf_annot_border_dash_count(fz_context *ctx, pdf_annot *annot); + +/* + How long is dash item i in the annotation border dash pattern? + */ +float pdf_annot_border_dash_item(fz_context *ctx, pdf_annot *annot, int i); + +/* + Retrieve the annotation border effect. + */ +enum pdf_border_effect pdf_annot_border_effect(fz_context *ctx, pdf_annot *annot); + +/* + Retrieve the annotation border effect intensity. + */ +float pdf_annot_border_effect_intensity(fz_context *ctx, pdf_annot *annot); + +/* + Retrieve the annotation opacity. (0 transparent, 1 solid). +*/ +float pdf_annot_opacity(fz_context *ctx, pdf_annot *annot); + +/* + Retrieve the annotation color. + + n components, each between 0 and 1. + n = 1 (grey), 3 (rgb) or 4 (cmyk). +*/ +void pdf_annot_color(fz_context *ctx, pdf_annot *annot, int *n, float color[4]); + +/* + Retrieve the annotation interior color. + + n components, each between 0 and 1. + n = 1 (grey), 3 (rgb) or 4 (cmyk). 
+*/ +void pdf_annot_interior_color(fz_context *ctx, pdf_annot *annot, int *n, float color[4]); + +/* + Retrieve the annotation quadding (justification) to use. + 0 = Left-justified + 1 = Centered + 2 = Right-justified +*/ +int pdf_annot_quadding(fz_context *ctx, pdf_annot *annot); + +/* + Retrieve the annotation's text language (either from the + annotation, or from the document). +*/ +fz_text_language pdf_annot_language(fz_context *ctx, pdf_annot *annot); + +/* + How many quad points does an annotation have? +*/ +int pdf_annot_quad_point_count(fz_context *ctx, pdf_annot *annot); + +/* + Get quadpoint i for an annotation. +*/ +fz_quad pdf_annot_quad_point(fz_context *ctx, pdf_annot *annot, int i); + +/* + How many strokes in the ink list for an annotation? +*/ +int pdf_annot_ink_list_count(fz_context *ctx, pdf_annot *annot); + +/* + How many vertexes in stroke i of the ink list for an annotation? +*/ +int pdf_annot_ink_list_stroke_count(fz_context *ctx, pdf_annot *annot, int i); + +/* + Get vertex k from stroke i of the ink list for an annotation, in + doc space. +*/ +fz_point pdf_annot_ink_list_stroke_vertex(fz_context *ctx, pdf_annot *annot, int i, int k); + +/* + Set the flags for an annotation. +*/ +void pdf_set_annot_flags(fz_context *ctx, pdf_annot *annot, int flags); + +/* + Set the stamp appearance stream to a custom image. + Fits the image to the current Rect, and shrinks the Rect + to fit the image aspect ratio. +*/ +void pdf_set_annot_stamp_image(fz_context *ctx, pdf_annot *annot, fz_image *image); + +/* + Set the bounding box for an annotation, in doc space. +*/ +void pdf_set_annot_rect(fz_context *ctx, pdf_annot *annot, fz_rect rect); + +/* + Set the border width for an annotation, in points, and remove any border effect. +*/ +void pdf_set_annot_border(fz_context *ctx, pdf_annot *annot, float width); + +/* + Set the border style for an annotation. 
+*/ +void pdf_set_annot_border_style(fz_context *ctx, pdf_annot *annot, enum pdf_border_style style); + +/* + Set the border width for an annotation in points; +*/ +void pdf_set_annot_border_width(fz_context *ctx, pdf_annot *annot, float width); + +/* + Clear the entire border dash pattern for an annotation. +*/ +void pdf_clear_annot_border_dash(fz_context *ctx, pdf_annot *annot); + +/* + Add an item to the end of the border dash pattern for an annotation. +*/ +void pdf_add_annot_border_dash_item(fz_context *ctx, pdf_annot *annot, float length); + +/* + Set the border effect for an annotation. +*/ +void pdf_set_annot_border_effect(fz_context *ctx, pdf_annot *annot, enum pdf_border_effect effect); + +/* + Set the border effect intensity for an annotation. +*/ +void pdf_set_annot_border_effect_intensity(fz_context *ctx, pdf_annot *annot, float intensity); + +/* + Set the opacity for an annotation, between 0 (transparent) and 1 + (solid). +*/ +void pdf_set_annot_opacity(fz_context *ctx, pdf_annot *annot, float opacity); + +/* + Set the annotation color. + + n components, each between 0 and 1. + n = 1 (grey), 3 (rgb) or 4 (cmyk). +*/ +void pdf_set_annot_color(fz_context *ctx, pdf_annot *annot, int n, const float *color); + +/* + Set the annotation interior color. + + n components, each between 0 and 1. + n = 1 (grey), 3 (rgb) or 4 (cmyk). +*/ +void pdf_set_annot_interior_color(fz_context *ctx, pdf_annot *annot, int n, const float *color); + +/* + Set the quadding (justification) to use for the annotation. + 0 = Left-justified + 1 = Centered + 2 = Right-justified +*/ +void pdf_set_annot_quadding(fz_context *ctx, pdf_annot *annot, int q); + +/* + Set the language for the annotation. +*/ +void pdf_set_annot_language(fz_context *ctx, pdf_annot *annot, fz_text_language lang); + +/* + Set the quad points for an annotation to those in the qv array + of length n. 
+*/ +void pdf_set_annot_quad_points(fz_context *ctx, pdf_annot *annot, int n, const fz_quad *qv); + +/* + Clear the quadpoint data for an annotation. +*/ +void pdf_clear_annot_quad_points(fz_context *ctx, pdf_annot *annot); + +/* + Append a new quad point to the quad point data in an annotation. +*/ +void pdf_add_annot_quad_point(fz_context *ctx, pdf_annot *annot, fz_quad quad); + +/* + Set the ink list for an annotation. + + n strokes. For 0 <= i < n, stroke i has count[i] points, + The vertexes for all the strokes are packed into a single + array, pointed to by v. +*/ +void pdf_set_annot_ink_list(fz_context *ctx, pdf_annot *annot, int n, const int *count, const fz_point *v); + +/* + Clear the ink list for an annotation. +*/ +void pdf_clear_annot_ink_list(fz_context *ctx, pdf_annot *annot); + +/* + Add a new stroke (initially empty) to the ink list for an + annotation. +*/ +void pdf_add_annot_ink_list_stroke(fz_context *ctx, pdf_annot *annot); + +/* + Add a new vertex to the last stroke in the ink list for an + annotation. +*/ +void pdf_add_annot_ink_list_stroke_vertex(fz_context *ctx, pdf_annot *annot, fz_point p); + +/* + Add a new stroke to the ink list for an annotation, and + populate it with the n points from stroke[]. 
+*/ +void pdf_add_annot_ink_list(fz_context *ctx, pdf_annot *annot, int n, fz_point stroke[]); + +/* + +*/ +void pdf_set_annot_icon_name(fz_context *ctx, pdf_annot *annot, const char *name); +void pdf_set_annot_is_open(fz_context *ctx, pdf_annot *annot, int is_open); + +enum pdf_line_ending pdf_annot_line_start_style(fz_context *ctx, pdf_annot *annot); +enum pdf_line_ending pdf_annot_line_end_style(fz_context *ctx, pdf_annot *annot); +void pdf_annot_line_ending_styles(fz_context *ctx, pdf_annot *annot, enum pdf_line_ending *start_style, enum pdf_line_ending *end_style); +void pdf_set_annot_line_start_style(fz_context *ctx, pdf_annot *annot, enum pdf_line_ending s); +void pdf_set_annot_line_end_style(fz_context *ctx, pdf_annot *annot, enum pdf_line_ending e); +void pdf_set_annot_line_ending_styles(fz_context *ctx, pdf_annot *annot, enum pdf_line_ending start_style, enum pdf_line_ending end_style); + +const char *pdf_annot_icon_name(fz_context *ctx, pdf_annot *annot); +int pdf_annot_is_open(fz_context *ctx, pdf_annot *annot); +int pdf_annot_is_standard_stamp(fz_context *ctx, pdf_annot *annot); + +void pdf_annot_line(fz_context *ctx, pdf_annot *annot, fz_point *a, fz_point *b); +void pdf_set_annot_line(fz_context *ctx, pdf_annot *annot, fz_point a, fz_point b); + +int pdf_annot_vertex_count(fz_context *ctx, pdf_annot *annot); +fz_point pdf_annot_vertex(fz_context *ctx, pdf_annot *annot, int i); + +void pdf_set_annot_vertices(fz_context *ctx, pdf_annot *annot, int n, const fz_point *v); +void pdf_clear_annot_vertices(fz_context *ctx, pdf_annot *annot); +void pdf_add_annot_vertex(fz_context *ctx, pdf_annot *annot, fz_point p); +void pdf_set_annot_vertex(fz_context *ctx, pdf_annot *annot, int i, fz_point p); + +const char *pdf_annot_contents(fz_context *ctx, pdf_annot *annot); +void pdf_set_annot_contents(fz_context *ctx, pdf_annot *annot, const char *text); + +const char *pdf_annot_author(fz_context *ctx, pdf_annot *annot); +void pdf_set_annot_author(fz_context *ctx, 
pdf_annot *annot, const char *author); + +int64_t pdf_annot_modification_date(fz_context *ctx, pdf_annot *annot); +void pdf_set_annot_modification_date(fz_context *ctx, pdf_annot *annot, int64_t time); +int64_t pdf_annot_creation_date(fz_context *ctx, pdf_annot *annot); +void pdf_set_annot_creation_date(fz_context *ctx, pdf_annot *annot, int64_t time); + +void pdf_parse_default_appearance(fz_context *ctx, const char *da, const char **font, float *size, int *n, float color[4]); +void pdf_print_default_appearance(fz_context *ctx, char *buf, int nbuf, const char *font, float size, int n, const float *color); +void pdf_annot_default_appearance(fz_context *ctx, pdf_annot *annot, const char **font, float *size, int *n, float color[4]); +void pdf_set_annot_default_appearance(fz_context *ctx, pdf_annot *annot, const char *font, float size, int n, const float *color); + +void pdf_annot_request_resynthesis(fz_context *ctx, pdf_annot *annot); +int pdf_annot_needs_resynthesis(fz_context *ctx, pdf_annot *annot); +void pdf_set_annot_resynthesised(fz_context *ctx, pdf_annot *annot); +void pdf_dirty_annot(fz_context *ctx, pdf_annot *annot); + +int pdf_annot_field_flags(fz_context *ctx, pdf_annot *annot); +const char *pdf_annot_field_value(fz_context *ctx, pdf_annot *annot); +const char *pdf_annot_field_label(fz_context *ctx, pdf_annot *widget); + +int pdf_set_annot_field_value(fz_context *ctx, pdf_document *doc, pdf_annot *widget, const char *text, int ignore_trigger_events); + +/* + Recreate the appearance stream for an annotation, if necessary. +*/ +fz_text *pdf_layout_fit_text(fz_context *ctx, fz_font *font, fz_text_language lang, const char *str, fz_rect bounds); + +/* + Start/Stop using the annotation-local xref. This allows us to + generate appearance streams that don't actually hit the underlying + document. 
+*/ +void pdf_annot_push_local_xref(fz_context *ctx, pdf_annot *annot); +void pdf_annot_pop_local_xref(fz_context *ctx, pdf_annot *annot); +void pdf_annot_ensure_local_xref(fz_context *ctx, pdf_annot *annot); +void pdf_annot_pop_and_discard_local_xref(fz_context *ctx, pdf_annot *annot); + +/* + Regenerate any appearance streams that are out of date and check for + cases where a different appearance stream should be selected because of + state changes. + + Note that a call to pdf_pass_event for one page may lead to changes on + any other, so an app should call pdf_update_annot for every annotation + it currently displays. Also it is important that the pdf_annot object + is the one used to last render the annotation. If instead the app were + to drop the page or annotations and reload them then a call to + pdf_update_annot would not reliably be able to report all changed + annotations. + + Returns true if the annotation appearance has changed since the last time + pdf_update_annot was called or the annotation was first loaded. +*/ +int pdf_update_annot(fz_context *ctx, pdf_annot *annot); + +/* + Recalculate form fields if necessary. + + Loop through all annotations on the page and update them. Return true + if any of them were changed (by either event or javascript actions, or + by annotation editing) and need re-rendering. + + If you need more granularity, loop through the annotations and call + pdf_update_annot for each one to detect changes on a per-annotation + basis. +*/ +int pdf_update_page(fz_context *ctx, pdf_page *page); + +/* + Update internal state appropriate for editing this field. When editing + is true, updating the text of the text widget will not have any + side-effects such as changing other widgets or running javascript. + This state is intended for the period when a text widget is having + characters typed into it. The state should be reverted at the end of + the edit sequence and the text newly updated. 
+*/ +void pdf_set_widget_editing_state(fz_context *ctx, pdf_annot *widget, int editing); + +int pdf_get_widget_editing_state(fz_context *ctx, pdf_annot *widget); + +/* + Toggle the state of a specified annotation. Applies only to check-box + and radio-button widgets. +*/ +int pdf_toggle_widget(fz_context *ctx, pdf_annot *widget); + +fz_display_list *pdf_new_display_list_from_annot(fz_context *ctx, pdf_annot *annot); + +/* + Render an annotation suitable for blending on top of the opaque + pixmap returned by fz_new_pixmap_from_page_contents. +*/ +fz_pixmap *pdf_new_pixmap_from_annot(fz_context *ctx, pdf_annot *annot, fz_matrix ctm, fz_colorspace *cs, fz_separations *seps, int alpha); +fz_stext_page *pdf_new_stext_page_from_annot(fz_context *ctx, pdf_annot *annot, const fz_stext_options *options); + +fz_layout_block *pdf_layout_text_widget(fz_context *ctx, pdf_annot *annot); + +typedef struct pdf_embedded_file_params pdf_embedded_file_params; + +/* + Parameters for an embedded file. Obtained through + pdf_get_embedded_file_params(). The creation and + modification date fields are < 0 if unknown. +*/ +struct pdf_embedded_file_params { + const char *filename; + const char *mimetype; + int size; + int64_t created; + int64_t modified; +}; + +/* + Check if pdf object is a file specification. +*/ +int pdf_is_embedded_file(fz_context *ctx, pdf_obj *fs); + +/* + Add an embedded file to the document. This can later + be passed e.g. to pdf_set_annot_filespec(). If unknown, + supply NULL for MIME type and -1 for the date arguments. + If a checksum is added it can later be verified by calling + pdf_verify_embedded_file_checksum(). +*/ +pdf_obj *pdf_add_embedded_file(fz_context *ctx, pdf_document *doc, const char *filename, const char *mimetype, fz_buffer *contents, int64_t created, int64_t modifed, int add_checksum); + +/* + Obtain parameters for embedded file: name, size, + creation and modification dates, and MIME type. 
+*/ +void pdf_get_embedded_file_params(fz_context *ctx, pdf_obj *fs, pdf_embedded_file_params *out); + +/* + Load embedded file contents in a buffer which + needs to be dropped by the caller after use. +*/ +fz_buffer *pdf_load_embedded_file_contents(fz_context *ctx, pdf_obj *fs); + +/* + Verifies the embedded file checksum. Returns 1 + if the verification is successful or there is no + checksum to be verified, or 0 if verification fails. +*/ +int pdf_verify_embedded_file_checksum(fz_context *ctx, pdf_obj *fs); + +pdf_obj *pdf_lookup_dest(fz_context *ctx, pdf_document *doc, pdf_obj *needle); +fz_link *pdf_load_link_annots(fz_context *ctx, pdf_document *, pdf_page *, pdf_obj *annots, int pagenum, fz_matrix page_ctm); + +void pdf_annot_MK_BG(fz_context *ctx, pdf_annot *annot, int *n, float color[4]); +void pdf_annot_MK_BC(fz_context *ctx, pdf_annot *annot, int *n, float color[4]); +int pdf_annot_MK_BG_rgb(fz_context *ctx, pdf_annot *annot, float rgb[3]); +int pdf_annot_MK_BC_rgb(fz_context *ctx, pdf_annot *annot, float rgb[3]); + +pdf_obj *pdf_annot_ap(fz_context *ctx, pdf_annot *annot); + +int pdf_annot_active(fz_context *ctx, pdf_annot *annot); +void pdf_set_annot_active(fz_context *ctx, pdf_annot *annot, int active); +int pdf_annot_hot(fz_context *ctx, pdf_annot *annot); +void pdf_set_annot_hot(fz_context *ctx, pdf_annot *annot, int hot); + +void pdf_set_annot_appearance(fz_context *ctx, pdf_annot *annot, const char *appearance, const char *state, fz_matrix ctm, fz_rect bbox, pdf_obj *res, fz_buffer *contents); +void pdf_set_annot_appearance_from_display_list(fz_context *ctx, pdf_annot *annot, const char *appearance, const char *state, fz_matrix ctm, fz_display_list *list); + +/* + Check to see if an annotation has a file specification. +*/ +int pdf_annot_has_filespec(fz_context *ctx, pdf_annot *annot); + +/* + Retrieve the file specification for the given annotation. 
+*/ +pdf_obj *pdf_annot_filespec(fz_context *ctx, pdf_annot *annot); + +/* + Set the annotation file specification. +*/ +void pdf_set_annot_filespec(fz_context *ctx, pdf_annot *annot, pdf_obj *obj); + +/* + Get/set a hidden flag preventing the annotation from being + rendered when it is being edited. This flag is independent + of the hidden flag in the PDF annotation object described in the PDF specification. +*/ +int pdf_annot_hidden_for_editing(fz_context *ctx, pdf_annot *annot); +void pdf_set_annot_hidden_for_editing(fz_context *ctx, pdf_annot *annot, int hidden); + +/* + * Apply Redaction annotation by redacting page underneath and removing the annotation. + */ +int pdf_apply_redaction(fz_context *ctx, pdf_annot *annot, pdf_redact_options *opts); + +#endif diff --git a/include/mupdf/pdf/clean.h b/include/mupdf/pdf/clean.h new file mode 100644 index 0000000..7759637 --- /dev/null +++ b/include/mupdf/pdf/clean.h @@ -0,0 +1,33 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_PDF_CLEAN_H +#define MUPDF_PDF_CLEAN_H + +#include "mupdf/pdf/document.h" + +/* + Read infile, and write selected pages to outfile with the given options. +*/ +void pdf_clean_file(fz_context *ctx, char *infile, char *outfile, char *password, pdf_write_options *opts, int retainlen, char *retainlist[]); + +#endif diff --git a/include/mupdf/pdf/cmap.h b/include/mupdf/pdf/cmap.h new file mode 100644 index 0000000..4c1a5cc --- /dev/null +++ b/include/mupdf/pdf/cmap.h @@ -0,0 +1,143 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_PDF_CMAP_H +#define MUPDF_PDF_CMAP_H + +#include "mupdf/fitz/store.h" +#include "mupdf/pdf/document.h" + +#define PDF_MRANGE_CAP 32 + +typedef struct +{ + unsigned short low, high, out; +} pdf_range; + +typedef struct +{ + unsigned int low, high, out; +} pdf_xrange; + +typedef struct +{ + unsigned int low, out; +} pdf_mrange; + +typedef struct cmap_splay cmap_splay; + +typedef struct pdf_cmap +{ + fz_storable storable; + char cmap_name[32]; + + char usecmap_name[32]; + struct pdf_cmap *usecmap; + + int wmode; + + int codespace_len; + struct + { + int n; + unsigned int low; + unsigned int high; + } codespace[40]; + + int rlen, rcap; + pdf_range *ranges; + + int xlen, xcap; + pdf_xrange *xranges; + + int mlen, mcap; + pdf_mrange *mranges; + + int dlen, dcap; + int *dict; + + int tlen, tcap, ttop; + cmap_splay *tree; +} pdf_cmap; + +pdf_cmap *pdf_new_cmap(fz_context *ctx); +pdf_cmap *pdf_keep_cmap(fz_context *ctx, pdf_cmap *cmap); +void pdf_drop_cmap(fz_context *ctx, pdf_cmap *cmap); +void pdf_drop_cmap_imp(fz_context *ctx, fz_storable *cmap); +size_t pdf_cmap_size(fz_context *ctx, pdf_cmap *cmap); + +int pdf_cmap_wmode(fz_context *ctx, pdf_cmap *cmap); +void pdf_set_cmap_wmode(fz_context *ctx, pdf_cmap *cmap, int wmode); +void pdf_set_usecmap(fz_context *ctx, pdf_cmap *cmap, pdf_cmap *usecmap); + +/* + Add a codespacerange section. + These ranges are used by pdf_decode_cmap to decode + multi-byte encoded strings. +*/ +void pdf_add_codespace(fz_context *ctx, pdf_cmap *cmap, unsigned int low, unsigned int high, size_t n); + +/* + Add a range of contiguous one-to-one mappings (ie 1..5 maps to 21..25) +*/ +void pdf_map_range_to_range(fz_context *ctx, pdf_cmap *cmap, unsigned int srclo, unsigned int srchi, int dstlo); + +/* + Add a single one-to-many mapping. 
+*/ +void pdf_map_one_to_many(fz_context *ctx, pdf_cmap *cmap, unsigned int one, int *many, size_t len); +void pdf_sort_cmap(fz_context *ctx, pdf_cmap *cmap); + +/* + Lookup the mapping of a codepoint. +*/ +int pdf_lookup_cmap(pdf_cmap *cmap, unsigned int cpt); +int pdf_lookup_cmap_full(pdf_cmap *cmap, unsigned int cpt, int *out); + +/* + Use the codespace ranges to extract a codepoint from a + multi-byte encoded string. +*/ +int pdf_decode_cmap(pdf_cmap *cmap, unsigned char *s, unsigned char *e, unsigned int *cpt); + +/* + Create an Identity-* CMap (for both 1 and 2-byte encodings) +*/ +pdf_cmap *pdf_new_identity_cmap(fz_context *ctx, int wmode, int bytes); +pdf_cmap *pdf_load_cmap(fz_context *ctx, fz_stream *file); + +/* + Load predefined CMap from system. +*/ +pdf_cmap *pdf_load_system_cmap(fz_context *ctx, const char *name); + +/* + Load built-in CMap resource. +*/ +pdf_cmap *pdf_load_builtin_cmap(fz_context *ctx, const char *name); + +/* + Load CMap stream in PDF file +*/ +pdf_cmap *pdf_load_embedded_cmap(fz_context *ctx, pdf_document *doc, pdf_obj *ref); + +#endif diff --git a/include/mupdf/pdf/crypt.h b/include/mupdf/pdf/crypt.h new file mode 100644 index 0000000..5273b65 --- /dev/null +++ b/include/mupdf/pdf/crypt.h @@ -0,0 +1,107 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. 
If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_PDF_CRYPT_H +#define MUPDF_PDF_CRYPT_H + +#include "mupdf/pdf/document.h" +#include "mupdf/pdf/object.h" + +enum +{ + PDF_ENCRYPT_KEEP, + PDF_ENCRYPT_NONE, + PDF_ENCRYPT_RC4_40, + PDF_ENCRYPT_RC4_128, + PDF_ENCRYPT_AES_128, + PDF_ENCRYPT_AES_256, + PDF_ENCRYPT_UNKNOWN +}; + +/* + Create crypt object for decrypting strings and streams + given the Encryption and ID objects. +*/ +pdf_crypt *pdf_new_crypt(fz_context *ctx, pdf_obj *enc, pdf_obj *id); +pdf_crypt *pdf_new_encrypt(fz_context *ctx, const char *opwd_utf8, const char *upwd_utf8, pdf_obj *id, int permissions, int algorithm); +void pdf_drop_crypt(fz_context *ctx, pdf_crypt *crypt); + +void pdf_crypt_obj(fz_context *ctx, pdf_crypt *crypt, pdf_obj *obj, int num, int gen); +fz_stream *pdf_open_crypt(fz_context *ctx, fz_stream *chain, pdf_crypt *crypt, int num, int gen); +fz_stream *pdf_open_crypt_with_filter(fz_context *ctx, fz_stream *chain, pdf_crypt *crypt, pdf_obj *name, int num, int gen); + +int pdf_crypt_version(fz_context *ctx, pdf_crypt *crypt); +int pdf_crypt_revision(fz_context *ctx, pdf_crypt *crypt); +const char *pdf_crypt_method(fz_context *ctx, pdf_crypt *crypt); +const char *pdf_crypt_string_method(fz_context *ctx, pdf_crypt *crypt); +const char *pdf_crypt_stream_method(fz_context *ctx, pdf_crypt *crypt); +int pdf_crypt_length(fz_context *ctx, pdf_crypt *crypt); +int pdf_crypt_permissions(fz_context *ctx, pdf_crypt *crypt); +int pdf_crypt_encrypt_metadata(fz_context *ctx, pdf_crypt *crypt); +unsigned char *pdf_crypt_owner_password(fz_context *ctx, pdf_crypt *crypt); +unsigned char *pdf_crypt_user_password(fz_context *ctx, pdf_crypt *crypt); +unsigned char *pdf_crypt_owner_encryption(fz_context *ctx, pdf_crypt *crypt); +unsigned char 
*pdf_crypt_user_encryption(fz_context *ctx, pdf_crypt *crypt); +unsigned char *pdf_crypt_permissions_encryption(fz_context *ctx, pdf_crypt *crypt); +unsigned char *pdf_crypt_key(fz_context *ctx, pdf_crypt *crypt); + +void pdf_print_crypt(fz_context *ctx, fz_output *out, pdf_crypt *crypt); + +void pdf_write_digest(fz_context *ctx, fz_output *out, pdf_obj *byte_range, pdf_obj *field, size_t digest_offset, size_t digest_length, pdf_pkcs7_signer *signer); + +/* + User access permissions from PDF reference. +*/ +enum +{ + PDF_PERM_PRINT = 1 << 2, + PDF_PERM_MODIFY = 1 << 3, + PDF_PERM_COPY = 1 << 4, + PDF_PERM_ANNOTATE = 1 << 5, + PDF_PERM_FORM = 1 << 8, + PDF_PERM_ACCESSIBILITY = 1 << 9, /* deprecated in pdf 2.0 (this permission is always granted) */ + PDF_PERM_ASSEMBLE = 1 << 10, + PDF_PERM_PRINT_HQ = 1 << 11, +}; + +int pdf_document_permissions(fz_context *ctx, pdf_document *doc); + +int pdf_signature_byte_range(fz_context *ctx, pdf_document *doc, pdf_obj *signature, fz_range *byte_range); + +/* + retrieve an fz_stream to read the bytes hashed for the signature +*/ +fz_stream *pdf_signature_hash_bytes(fz_context *ctx, pdf_document *doc, pdf_obj *signature); + +int pdf_signature_incremental_change_since_signing(fz_context *ctx, pdf_document *doc, pdf_obj *signature); + +/* + Retrieve the contents of a signature as a counted allocated + block that must be freed by the caller. 
+*/ +size_t pdf_signature_contents(fz_context *ctx, pdf_document *doc, pdf_obj *signature, char **contents); + +void pdf_encrypt_data(fz_context *ctx, pdf_crypt *crypt, int num, int gen, void (*fmt_str_out)(fz_context *, void *, const unsigned char *, size_t), void *arg, const unsigned char *s, size_t n); + +size_t pdf_encrypted_len(fz_context *ctx, pdf_crypt *crypt, int num, int gen, size_t len); + +#endif diff --git a/include/mupdf/pdf/document.h b/include/mupdf/pdf/document.h new file mode 100644 index 0000000..a711a70 --- /dev/null +++ b/include/mupdf/pdf/document.h @@ -0,0 +1,808 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_PDF_DOCUMENT_H +#define MUPDF_PDF_DOCUMENT_H + +#include "mupdf/fitz/export.h" +#include "mupdf/fitz/document.h" +#include "mupdf/fitz/hash.h" +#include "mupdf/fitz/stream.h" +#include "mupdf/fitz/xml.h" +#include "mupdf/pdf/object.h" + +typedef struct pdf_xref pdf_xref; +typedef struct pdf_ocg_descriptor pdf_ocg_descriptor; + +typedef struct pdf_page pdf_page; +typedef struct pdf_annot pdf_annot; +typedef struct pdf_js pdf_js; +typedef struct pdf_document pdf_document; + +enum +{ + PDF_LEXBUF_SMALL = 256, + PDF_LEXBUF_LARGE = 65536 +}; + +typedef struct +{ + size_t size; + size_t base_size; + size_t len; + int64_t i; + float f; + char *scratch; + char buffer[PDF_LEXBUF_SMALL]; +} pdf_lexbuf; + +typedef struct +{ + pdf_lexbuf base; + char buffer[PDF_LEXBUF_LARGE - PDF_LEXBUF_SMALL]; +} pdf_lexbuf_large; + +/* + Document event structures are mostly opaque to the app. Only the type + is visible to the app. +*/ +typedef struct pdf_doc_event pdf_doc_event; + +/* + the type of function via which the app receives + document events. +*/ +typedef void (pdf_doc_event_cb)(fz_context *ctx, pdf_document *doc, pdf_doc_event *evt, void *data); + +/* + the type of function via which the app frees + the data provided to the event callback pdf_doc_event_cb. +*/ +typedef void (pdf_free_doc_event_data_cb)(fz_context *ctx, void *data); + +typedef struct pdf_js_console pdf_js_console; + +/* + Callback called when the console is dropped because it + is being replaced or the javascript is being disabled + by a call to pdf_disable_js(). +*/ +typedef void (pdf_js_console_drop_cb)(pdf_js_console *console, void *user); + +/* + Callback signalling that a piece of javascript is asking + the javascript console to be displayed. +*/ +typedef void (pdf_js_console_show_cb)(void *user); + +/* + Callback signalling that a piece of javascript is asking + the javascript console to be hidden. 
+*/ +typedef void (pdf_js_console_hide_cb)(void *user); + +/* + Callback signalling that a piece of javascript is asking + the javascript console to remove all its contents. +*/ +typedef void (pdf_js_console_clear_cb)(void *user); + +/* + Callback signalling that a piece of javascript is appending + the given message to the javascript console contents. +*/ +typedef void (pdf_js_console_write_cb)(void *user, const char *msg); + +/* + The callback functions relating to a javascript console. +*/ +typedef struct pdf_js_console { + pdf_js_console_drop_cb *drop; + pdf_js_console_show_cb *show; + pdf_js_console_hide_cb *hide; + pdf_js_console_clear_cb *clear; + pdf_js_console_write_cb *write; +} pdf_js_console; + +/* + Retrieve the currently set javascript console, or NULL + if none is set. +*/ +pdf_js_console *pdf_js_get_console(fz_context *ctx, pdf_document *doc); + +/* + Set a new javascript console. + + console: A set of callback functions informing about + what pieces of executed js are trying to do + to the js console. The caller transfers ownership of + console when calling pdf_js_set_console(). Once it and + the corresponding user pointer are no longer needed + console->drop() will be called passing both the console + and the user pointer. + + user: Opaque data that will be passed unchanged to all + js console callbacks when called. The caller ensures + that this is valid until either the js console is + replaced by calling pdf_js_set_console() again with a + new console, or pdf_disable_js() is called. In either + case the caller ensures that the user data is freed. +*/ +void pdf_js_set_console(fz_context *ctx, pdf_document *doc, pdf_js_console *console, void *user); + +/* + Open a PDF document. + + Open a PDF document by reading its cross reference table, so + MuPDF can locate PDF objects inside the file.
Upon a broken + cross reference table or other parse errors MuPDF will restart + parsing the file from the beginning to try to rebuild a + (hopefully correct) cross reference table to allow further + processing of the file. + + The returned pdf_document should be used when calling most + other PDF functions. Note that it wraps the context, so those + functions implicitly get access to the global state in + context. + + filename: a path to a file as it would be given to open(2). +*/ +pdf_document *pdf_open_document(fz_context *ctx, const char *filename); + +/* + Opens a PDF document. + + Same as pdf_open_document, but takes a stream instead of a + filename to locate the PDF document to open. Increments the + reference count of the stream. See fz_open_file, + fz_open_file_w or fz_open_fd for opening a stream, and + fz_drop_stream for closing an open stream. +*/ +pdf_document *pdf_open_document_with_stream(fz_context *ctx, fz_stream *file); + +/* + Closes and frees an opened PDF document. + + The resource store in the context associated with pdf_document + is emptied. +*/ +void pdf_drop_document(fz_context *ctx, pdf_document *doc); + +pdf_document *pdf_keep_document(fz_context *ctx, pdf_document *doc); + +/* + down-cast a fz_document to a pdf_document. + Returns NULL if underlying document is not PDF +*/ +pdf_document *pdf_specifics(fz_context *ctx, fz_document *doc); + +/* + Down-cast generic fitz objects into pdf specific variants. + Returns NULL if the objects are not from a PDF document. +*/ +pdf_document *pdf_document_from_fz_document(fz_context *ctx, fz_document *ptr); +pdf_page *pdf_page_from_fz_page(fz_context *ctx, fz_page *ptr); + +int pdf_needs_password(fz_context *ctx, pdf_document *doc); + +/* + Attempt to authenticate a + password. + + Returns 0 for failure, non-zero for success.
+ + In the non-zero case: + bit 0 set => no password required + bit 1 set => user password authenticated + bit 2 set => owner password authenticated +*/ +int pdf_authenticate_password(fz_context *ctx, pdf_document *doc, const char *pw); + +int pdf_has_permission(fz_context *ctx, pdf_document *doc, fz_permission p); +int pdf_lookup_metadata(fz_context *ctx, pdf_document *doc, const char *key, char *ptr, int size); + +fz_outline *pdf_load_outline(fz_context *ctx, pdf_document *doc); + +fz_outline_iterator *pdf_new_outline_iterator(fz_context *ctx, pdf_document *doc); + +void pdf_invalidate_xfa(fz_context *ctx, pdf_document *doc); + +/* + Get the number of layer configurations defined in this document. + + doc: The document in question. +*/ +int pdf_count_layer_configs(fz_context *ctx, pdf_document *doc); + +/* + Configure visibility of individual layers in this document. +*/ +int pdf_count_layers(fz_context *ctx, pdf_document *doc); +const char *pdf_layer_name(fz_context *ctx, pdf_document *doc, int layer); +int pdf_layer_is_enabled(fz_context *ctx, pdf_document *doc, int layer); +void pdf_enable_layer(fz_context *ctx, pdf_document *doc, int layer, int enabled); + +typedef struct +{ + const char *name; + const char *creator; +} pdf_layer_config; + +/* + Fetch the name (and optionally creator) of the given layer config. + + doc: The document in question. + + config_num: A value in the 0..n-1 range, where n is the + value returned from pdf_count_layer_configs. + + info: Pointer to structure to fill in. Pointers within + this structure may be set to NULL if no information is + available. +*/ +void pdf_layer_config_info(fz_context *ctx, pdf_document *doc, int config_num, pdf_layer_config *info); + +/* + Set the current configuration. + This updates the visibility of the optional content groups + within the document. + + doc: The document in question. + + config_num: A value in the 0..n-1 range, where n is the + value returned from pdf_count_layer_configs. 
+*/ +void pdf_select_layer_config(fz_context *ctx, pdf_document *doc, int config_num); + +/* + Returns the number of entries in the 'UI' for this layer configuration. + + doc: The document in question. +*/ +int pdf_count_layer_config_ui(fz_context *ctx, pdf_document *doc); + +/* + Select a checkbox/radiobox within the 'UI' for this layer + configuration. + + Selecting a UI entry that is a radiobox may disable + other UI entries. + + doc: The document in question. + + ui: A value in the 0..m-1 range, where m is the value + returned by pdf_count_layer_config_ui. +*/ +void pdf_select_layer_config_ui(fz_context *ctx, pdf_document *doc, int ui); + +/* + Deselect a checkbox/radiobox within the 'UI' for this layer configuration. + + doc: The document in question. + + ui: A value in the 0..m-1 range, where m is the value + returned by pdf_count_layer_config_ui. +*/ +void pdf_deselect_layer_config_ui(fz_context *ctx, pdf_document *doc, int ui); + +/* + Toggle a checkbox/radiobox within the 'UI' for this layer configuration. + + Toggling a UI entry that is a radiobox may disable + other UI entries. + + doc: The document in question. + + ui: A value in the 0..m-1 range, where m is the value + returned by pdf_count_layer_config_ui. +*/ +void pdf_toggle_layer_config_ui(fz_context *ctx, pdf_document *doc, int ui); + +typedef enum +{ + PDF_LAYER_UI_LABEL = 0, + PDF_LAYER_UI_CHECKBOX = 1, + PDF_LAYER_UI_RADIOBOX = 2 +} pdf_layer_config_ui_type; + +typedef struct +{ + const char *text; + int depth; + pdf_layer_config_ui_type type; + int selected; + int locked; +} pdf_layer_config_ui; + +/* + Get the info for a given entry in the layer config ui. + + doc: The document in question. + + ui: A value in the 0..m-1 range, where m is the value + returned by pdf_count_layer_config_ui. + + info: Pointer to a structure to fill in with information + about the requested ui entry.
+*/ +void pdf_layer_config_ui_info(fz_context *ctx, pdf_document *doc, int ui, pdf_layer_config_ui *info); + +/* + Write the current layer config back into the document as the default state. +*/ +void pdf_set_layer_config_as_default(fz_context *ctx, pdf_document *doc); + +/* + Determine whether changes have been made since the + document was opened or last saved. +*/ +int pdf_has_unsaved_changes(fz_context *ctx, pdf_document *doc); + +/* + Determine if this PDF has been repaired since opening. +*/ +int pdf_was_repaired(fz_context *ctx, pdf_document *doc); + +/* Object that can perform the cryptographic operation necessary for document signing */ +typedef struct pdf_pkcs7_signer pdf_pkcs7_signer; + +/* Unsaved signature fields */ +typedef struct pdf_unsaved_sig +{ + pdf_obj *field; + size_t byte_range_start; + size_t byte_range_end; + size_t contents_start; + size_t contents_end; + pdf_pkcs7_signer *signer; + struct pdf_unsaved_sig *next; +} pdf_unsaved_sig; + +typedef struct +{ + int page; + int object; +} pdf_rev_page_map; + +typedef struct +{ + int number; /* Page object number */ + int64_t offset; /* Offset of page object */ + int64_t index; /* Index into shared hint_shared_ref */ +} pdf_hint_page; + +typedef struct +{ + int number; /* Object number of first object */ + int64_t offset; /* Offset of first object */ +} pdf_hint_shared; + +struct pdf_document +{ + fz_document super; + + fz_stream *file; + + int version; + int64_t startxref; + int64_t file_size; + pdf_crypt *crypt; + pdf_ocg_descriptor *ocg; + fz_colorspace *oi; + + int max_xref_len; + int num_xref_sections; + int saved_num_xref_sections; + int num_incremental_sections; + int xref_base; + int disallow_new_increments; + + /* The local_xref is only active, if local_xref_nesting >= 0 */ + pdf_xref *local_xref; + int local_xref_nesting; + + pdf_xref *xref_sections; + pdf_xref *saved_xref_sections; + int *xref_index; + int save_in_progress; + int last_xref_was_old_style; + int has_linearization_object; + 
+ int map_page_count; + pdf_rev_page_map *rev_page_map; + pdf_obj **fwd_page_map; + int page_tree_broken; + + int repair_attempted; + int repair_in_progress; + int non_structural_change; /* True if we are modifying the document in a way that does not change the (page) structure */ + + /* State indicating which file parsing method we are using */ + int file_reading_linearly; + int64_t file_length; + + int linear_page_count; + pdf_obj *linear_obj; /* Linearized object (if used) */ + pdf_obj **linear_page_refs; /* Page objects for linear loading */ + int linear_page1_obj_num; + + /* The state for the pdf_progressive_advance parser */ + int64_t linear_pos; + int linear_page_num; + + int hint_object_offset; + int hint_object_length; + int hints_loaded; /* Set to 1 after the hints loading has completed, + * whether successful or not! */ + /* Page n references shared object references: + * hint_shared_ref[i] + * where + * i = s to e-1 + * s = hint_page[n]->index + * e = hint_page[n+1]->index + * Shared object reference r accesses objects: + * rs to re-1 + * where + * rs = hint_shared[r]->number + * re = hint_shared[r]->count + rs + * These are guaranteed to lie within the region starting at + * hint_shared[r]->offset of length hint_shared[r]->length + */ + pdf_hint_page *hint_page; + int *hint_shared_ref; + pdf_hint_shared *hint_shared; + int hint_obj_offsets_max; + int64_t *hint_obj_offsets; + + int resources_localised; + + pdf_lexbuf_large lexbuf; + + pdf_js *js; + + int recalculate; + int redacted; + int resynth_required; + + pdf_doc_event_cb *event_cb; + pdf_free_doc_event_data_cb *free_event_data_cb; + void *event_cb_data; + + int num_type3_fonts; + int max_type3_fonts; + fz_font **type3_fonts; + + struct { + fz_hash_table *fonts; + } resources; + + int orphans_max; + int orphans_count; + pdf_obj **orphans; + + fz_xml_doc *xfa; + + pdf_journal *journal; +}; + +pdf_document *pdf_create_document(fz_context *ctx); + +typedef struct pdf_graft_map pdf_graft_map; + +/* + 
Return a deep copied object equivalent to the + supplied object, suitable for use within the given document. + + dst: The document in which the returned object is to be used. + + obj: The object to deep copy. + + Note: If grafting multiple objects, you should use a pdf_graft_map + to avoid potential duplication of target objects. +*/ +pdf_obj *pdf_graft_object(fz_context *ctx, pdf_document *dst, pdf_obj *obj); + +/* + Prepare a graft map object to allow objects + to be deep copied from one document to the given one, avoiding + problems with duplicated child objects. + + dst: The document to copy objects to. + + Note: all the source objects must come from the same document. +*/ +pdf_graft_map *pdf_new_graft_map(fz_context *ctx, pdf_document *dst); + +pdf_graft_map *pdf_keep_graft_map(fz_context *ctx, pdf_graft_map *map); +void pdf_drop_graft_map(fz_context *ctx, pdf_graft_map *map); + +/* + Return a deep copied object equivalent + to the supplied object, suitable for use within the target + document of the map. + + map: A map targeted at the document in which the returned + object is to be used. + + obj: The object to be copied. + + Note: Copying multiple objects via the same graft map ensures + that any shared children are not copied more than once. +*/ +pdf_obj *pdf_graft_mapped_object(fz_context *ctx, pdf_graft_map *map, pdf_obj *obj); + +/* + Graft a page (and its resources) from the src document to the + destination document of the graft. This involves a deep copy + of the objects in question. + + map: A map targeted at the document into which the page should + be inserted. + + page_to: The position within the destination document at which + the page should be inserted (pages numbered from 0, with -1 + meaning "at the end"). + + src: The document from which the page should be copied. + + page_from: The page number which should be copied from the src + document (pages numbered from 0, with -1 meaning "at the end").
+*/ +void pdf_graft_page(fz_context *ctx, pdf_document *dst, int page_to, pdf_document *src, int page_from); +void pdf_graft_mapped_page(fz_context *ctx, pdf_graft_map *map, int page_to, pdf_document *src, int page_from); + +/* + Create a device that will record the + graphical operations given to it into a sequence of + pdf operations, together with a set of resources. This + sequence/set pair can then be used as the basis for + adding a page to the document (see pdf_add_page). + Returns a kept reference. + + doc: The document for which these are intended. + + mediabox: The bbox for the created page. + + presources: Pointer to a place to put the created + resources dictionary. + + pcontents: Pointer to a place to put the created + contents buffer. +*/ +fz_device *pdf_page_write(fz_context *ctx, pdf_document *doc, fz_rect mediabox, pdf_obj **presources, fz_buffer **pcontents); + +/* + Create a pdf device. Rendering to the device creates + new pdf content. WARNING: this device is work in progress. It doesn't + currently support all rendering cases. + + Note that contents must be a stream (dictionary) to be updated (or + a reference to a stream). Callers should take care to ensure that it + is not an array, and that it is not shared with other objects/pages. +*/ +fz_device *pdf_new_pdf_device(fz_context *ctx, pdf_document *doc, fz_matrix topctm, pdf_obj *resources, fz_buffer *contents); + +/* + Create a pdf_obj within a document that + represents a page, from a previously created resources + dictionary and page content stream. This should then be + inserted into the document using pdf_insert_page. + + After this call the page exists within the document + structure, but is not actually ever displayed as it is + not linked into the PDF page tree. + + doc: The document to which to add the page. + + mediabox: The mediabox for the page (should be identical + to that used when creating the resources/contents). + + rotate: 0, 90, 180 or 270.
The rotation to use for the + page. + + resources: The resources dictionary for the new page + (typically created by pdf_page_write). + + contents: The page contents for the new page (typically + created by pdf_page_write). +*/ +pdf_obj *pdf_add_page(fz_context *ctx, pdf_document *doc, fz_rect mediabox, int rotate, pdf_obj *resources, fz_buffer *contents); + +/* + Insert a page previously created by + pdf_add_page into the pages tree of the document. + + doc: The document to insert into. + + at: The page number to insert at (pages numbered from 0). + 0 <= n <= page_count inserts before page n. Negative numbers + or INT_MAX are treated as page count, and insert at the end. + 0 inserts at the start. All existing pages after the + insertion point are shuffled up. + + page: The page to insert. +*/ +void pdf_insert_page(fz_context *ctx, pdf_document *doc, int at, pdf_obj *page); + +/* + Delete a page from the page tree of + a document. This does not remove the page contents + or resources from the file. + + doc: The document to operate on. + + number: The page to remove (numbered from 0) +*/ +void pdf_delete_page(fz_context *ctx, pdf_document *doc, int number); + +/* + Delete a range of pages from the + page tree of a document. This does not remove the page + contents or resources from the file. + + doc: The document to operate on. + + start, end: The range of pages (numbered from 0) + (inclusive, exclusive) to remove. If end is negative or + greater than the number of pages in the document, it + will be taken to be the end of the document. +*/ +void pdf_delete_page_range(fz_context *ctx, pdf_document *doc, int start, int end); + +/* + Get page label (string) from a page number (index).
+*/ +void pdf_page_label(fz_context *ctx, pdf_document *doc, int page, char *buf, size_t size); +void pdf_page_label_imp(fz_context *ctx, fz_document *doc, int chapter, int page, char *buf, size_t size); + +typedef enum { + PDF_PAGE_LABEL_NONE = 0, + PDF_PAGE_LABEL_DECIMAL = 'D', + PDF_PAGE_LABEL_ROMAN_UC = 'R', + PDF_PAGE_LABEL_ROMAN_LC = 'r', + PDF_PAGE_LABEL_ALPHA_UC = 'A', + PDF_PAGE_LABEL_ALPHA_LC = 'a', +} pdf_page_label_style; + +void pdf_set_page_labels(fz_context *ctx, pdf_document *doc, int index, pdf_page_label_style style, const char *prefix, int start); +void pdf_delete_page_labels(fz_context *ctx, pdf_document *doc, int index); + +fz_text_language pdf_document_language(fz_context *ctx, pdf_document *doc); +void pdf_set_document_language(fz_context *ctx, pdf_document *doc, fz_text_language lang); + +/* + In calls to fz_save_document, the following options structure can be used + to control aspects of the writing process. This structure may grow + in the future, and should be zero-filled to allow forwards compatibility. +*/ +typedef struct +{ + int do_incremental; /* Write just the changed objects. */ + int do_pretty; /* Pretty-print dictionaries and arrays. */ + int do_ascii; /* ASCII hex encode binary streams. */ + int do_compress; /* Compress streams. */ + int do_compress_images; /* Compress (or leave compressed) image streams. */ + int do_compress_fonts; /* Compress (or leave compressed) font streams. */ + int do_decompress; /* Decompress streams (except when compressing images/fonts). */ + int do_garbage; /* Garbage collect objects before saving; 1=gc, 2=re-number, 3=de-duplicate. */ + int do_linear; /* Write linearised. */ + int do_clean; /* Clean content streams. */ + int do_sanitize; /* Sanitize content streams. */ + int do_appearance; /* (Re)create appearance streams. */ + int do_encrypt; /* Encryption method to use: keep, none, rc4-40, etc. 
*/ + int dont_regenerate_id; /* Don't regenerate ID if set (used for clean) */ + int permissions; /* Document encryption permissions. */ + char opwd_utf8[128]; /* Owner password. */ + char upwd_utf8[128]; /* User password. */ + int do_snapshot; /* Do not use directly. Use the snapshot functions. */ + int do_preserve_metadata; /* When cleaning, preserve metadata unchanged. */ +} pdf_write_options; + +FZ_DATA extern const pdf_write_options pdf_default_write_options; + +/* + Parse option string into a pdf_write_options struct. + Matches the command line options to 'mutool clean': + g: garbage collect + d, i, f: expand all, fonts, images + l: linearize + a: ascii hex encode + z: deflate + c: clean content streams + s: sanitize content streams +*/ +pdf_write_options *pdf_parse_write_options(fz_context *ctx, pdf_write_options *opts, const char *args); + +/* + Returns true if there are digital signatures waiting to + be updated on save. +*/ +int pdf_has_unsaved_sigs(fz_context *ctx, pdf_document *doc); + +/* + Write out the document to an output stream with all changes finalised. +*/ +void pdf_write_document(fz_context *ctx, pdf_document *doc, fz_output *out, const pdf_write_options *opts); + +/* + Write out the document to a file with all changes finalised. +*/ +void pdf_save_document(fz_context *ctx, pdf_document *doc, const char *filename, const pdf_write_options *opts); + +/* + Snapshot the document to a file. This does not cause the + incremental xref to be finalized, so the document in memory + remains (essentially) unchanged. +*/ +void pdf_save_snapshot(fz_context *ctx, pdf_document *doc, const char *filename); + +/* + Snapshot the document to an output stream. This does not cause + the incremental xref to be finalized, so the document in memory + remains (essentially) unchanged.
+*/ +void pdf_write_snapshot(fz_context *ctx, pdf_document *doc, fz_output *out); + +char *pdf_format_write_options(fz_context *ctx, char *buffer, size_t buffer_len, const pdf_write_options *opts); + +/* + Return true if the document can be saved incrementally. Applying + redactions or having a repaired document make incremental saving + impossible. +*/ +int pdf_can_be_saved_incrementally(fz_context *ctx, pdf_document *doc); + +/* + Write out the journal to an output stream. +*/ +void pdf_write_journal(fz_context *ctx, pdf_document *doc, fz_output *out); + +/* + Write out the journal to a file. +*/ +void pdf_save_journal(fz_context *ctx, pdf_document *doc, const char *filename); + +/* + Read a journal from a filename. Will do nothing if the journal + does not match. Will throw on a corrupted journal. +*/ +void pdf_load_journal(fz_context *ctx, pdf_document *doc, const char *filename); + +/* + Read a journal from a stream. Will do nothing if the journal + does not match. Will throw on a corrupted journal. +*/ +void pdf_read_journal(fz_context *ctx, pdf_document *doc, fz_stream *stm); + +/* + Minimize the memory used by a document. + + We walk the in memory xref tables, evicting the PDF objects + therein that aren't in use. + + This reduces the current memory use, but any subsequent use + of these objects will load them back into memory again. +*/ +void pdf_minimize_document(fz_context *ctx, pdf_document *doc); + + +#endif diff --git a/include/mupdf/pdf/event.h b/include/mupdf/pdf/event.h new file mode 100644 index 0000000..5f138ee --- /dev/null +++ b/include/mupdf/pdf/event.h @@ -0,0 +1,167 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. 
+// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_PDF_EVENT_H +#define MUPDF_PDF_EVENT_H + +#include "mupdf/pdf/document.h" + +/* + Document events: the objects via which MuPDF informs the calling app + of occurrences emanating from the document, possibly from user interaction + or javascript execution. MuPDF informs the app of document events via a + callback. +*/ + +struct pdf_doc_event +{ + int type; +}; + +enum +{ + PDF_DOCUMENT_EVENT_ALERT, + PDF_DOCUMENT_EVENT_PRINT, + PDF_DOCUMENT_EVENT_LAUNCH_URL, + PDF_DOCUMENT_EVENT_MAIL_DOC, + PDF_DOCUMENT_EVENT_SUBMIT, + PDF_DOCUMENT_EVENT_EXEC_MENU_ITEM, +}; + +/* + set the function via which to receive + document events. +*/ +void pdf_set_doc_event_callback(fz_context *ctx, pdf_document *doc, pdf_doc_event_cb *event_cb, pdf_free_doc_event_data_cb *free_event_data_cb, void *data); +void *pdf_get_doc_event_callback_data(fz_context *ctx, pdf_document *doc); + +/* + The various types of document events +*/ + +/* + details of an alert event. In response the app should + display an alert dialog with the buttons specified by "button_group_type". + If "check_box_message" is non-NULL, a checkbox should be displayed in + the lower-left corner along with the message. + + "finally_checked" and "button_pressed" should be set by the app + before returning from the callback. "finally_checked" need be set + only if "check_box_message" is non-NULL.
+*/ +typedef struct +{ + pdf_document *doc; + const char *message; + int icon_type; + int button_group_type; + const char *title; + int has_check_box; + const char *check_box_message; + int initially_checked; + int finally_checked; + int button_pressed; +} pdf_alert_event; + +/* Possible values of icon_type */ +enum +{ + PDF_ALERT_ICON_ERROR, + PDF_ALERT_ICON_WARNING, + PDF_ALERT_ICON_QUESTION, + PDF_ALERT_ICON_STATUS +}; + +/* Possible values of button_group_type */ +enum +{ + PDF_ALERT_BUTTON_GROUP_OK, + PDF_ALERT_BUTTON_GROUP_OK_CANCEL, + PDF_ALERT_BUTTON_GROUP_YES_NO, + PDF_ALERT_BUTTON_GROUP_YES_NO_CANCEL +}; + +/* Possible values of button_pressed */ +enum +{ + PDF_ALERT_BUTTON_NONE, + PDF_ALERT_BUTTON_OK, + PDF_ALERT_BUTTON_CANCEL, + PDF_ALERT_BUTTON_NO, + PDF_ALERT_BUTTON_YES +}; + +/* + access the details of an alert event + The returned pointer and all the data referred to by the + structure are owned by mupdf and need not be freed by the + caller. +*/ +pdf_alert_event *pdf_access_alert_event(fz_context *ctx, pdf_doc_event *evt); + +/* + access the details of an execMenuItem + event, which consists of just the name of the menu item +*/ +const char *pdf_access_exec_menu_item_event(fz_context *ctx, pdf_doc_event *evt); + +/* + details of a launch-url event. The app should + open the url, either in a new frame or in the current window. +*/ +typedef struct +{ + const char *url; + int new_frame; +} pdf_launch_url_event; + +/* + access the details of a launch-url + event. The returned pointer and all data referred to by the structure + are owned by mupdf and need not be freed by the caller. +*/ +pdf_launch_url_event *pdf_access_launch_url_event(fz_context *ctx, pdf_doc_event *evt); + +/* + details of a mail_doc event. The app should save + the current state of the document and email it using the specified + parameters.
+*/ +typedef struct +{ + int ask_user; + const char *to; + const char *cc; + const char *bcc; + const char *subject; + const char *message; +} pdf_mail_doc_event; + +pdf_mail_doc_event *pdf_access_mail_doc_event(fz_context *ctx, pdf_doc_event *evt); + +void pdf_event_issue_alert(fz_context *ctx, pdf_document *doc, pdf_alert_event *evt); +void pdf_event_issue_print(fz_context *ctx, pdf_document *doc); +void pdf_event_issue_exec_menu_item(fz_context *ctx, pdf_document *doc, const char *item); +void pdf_event_issue_launch_url(fz_context *ctx, pdf_document *doc, const char *url, int new_frame); +void pdf_event_issue_mail_doc(fz_context *ctx, pdf_document *doc, pdf_mail_doc_event *evt); + +#endif diff --git a/include/mupdf/pdf/font.h b/include/mupdf/pdf/font.h new file mode 100644 index 0000000..79f9d40 --- /dev/null +++ b/include/mupdf/pdf/font.h @@ -0,0 +1,160 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_PDF_FONT_H +#define MUPDF_PDF_FONT_H + +#include "mupdf/pdf/cmap.h" +#include "mupdf/fitz/device.h" +#include "mupdf/fitz/font.h" + +enum +{ + PDF_FD_FIXED_PITCH = 1 << 0, + PDF_FD_SERIF = 1 << 1, + PDF_FD_SYMBOLIC = 1 << 2, + PDF_FD_SCRIPT = 1 << 3, + PDF_FD_NONSYMBOLIC = 1 << 5, + PDF_FD_ITALIC = 1 << 6, + PDF_FD_ALL_CAP = 1 << 16, + PDF_FD_SMALL_CAP = 1 << 17, + PDF_FD_FORCE_BOLD = 1 << 18 +}; + +void pdf_load_encoding(const char **estrings, const char *encoding); + +typedef struct +{ + unsigned short lo; + unsigned short hi; + int w; /* type3 fonts can be big! */ +} pdf_hmtx; + +typedef struct +{ + unsigned short lo; + unsigned short hi; + short x; + short y; + short w; +} pdf_vmtx; + +typedef struct +{ + fz_storable storable; + size_t size; + + fz_font *font; + + /* FontDescriptor */ + int flags; + float italic_angle; + float ascent; + float descent; + float cap_height; + float x_height; + float missing_width; + + /* Encoding (CMap) */ + pdf_cmap *encoding; + pdf_cmap *to_ttf_cmap; + size_t cid_to_gid_len; + unsigned short *cid_to_gid; + + /* ToUnicode */ + pdf_cmap *to_unicode; + size_t cid_to_ucs_len; + unsigned short *cid_to_ucs; + + /* Metrics (given in the PDF file) */ + int wmode; + + int hmtx_len, hmtx_cap; + pdf_hmtx dhmtx; + pdf_hmtx *hmtx; + + int vmtx_len, vmtx_cap; + pdf_vmtx dvmtx; + pdf_vmtx *vmtx; + + int is_embedded; + int t3loading; +} pdf_font_desc; + +void pdf_set_font_wmode(fz_context *ctx, pdf_font_desc *font, int wmode); +void pdf_set_default_hmtx(fz_context *ctx, pdf_font_desc *font, int w); +void pdf_set_default_vmtx(fz_context *ctx, pdf_font_desc *font, int y, int w); +void pdf_add_hmtx(fz_context *ctx, pdf_font_desc *font, int lo, int hi, int w); +void pdf_add_vmtx(fz_context *ctx, pdf_font_desc *font, int lo, int hi, int x, int y, int w); +void pdf_end_hmtx(fz_context *ctx, pdf_font_desc *font); +void pdf_end_vmtx(fz_context *ctx, pdf_font_desc *font); +pdf_hmtx pdf_lookup_hmtx(fz_context *ctx, pdf_font_desc *font, 
int cid); +pdf_vmtx pdf_lookup_vmtx(fz_context *ctx, pdf_font_desc *font, int cid); + +void pdf_load_to_unicode(fz_context *ctx, pdf_document *doc, pdf_font_desc *font, const char **strings, char *collection, pdf_obj *cmapstm); + +int pdf_font_cid_to_gid(fz_context *ctx, pdf_font_desc *fontdesc, int cid); +const char *pdf_clean_font_name(const char *fontname); + +const unsigned char *pdf_lookup_substitute_font(fz_context *ctx, int mono, int serif, int bold, int italic, int *len); + +pdf_font_desc *pdf_load_type3_font(fz_context *ctx, pdf_document *doc, pdf_obj *rdb, pdf_obj *obj); +void pdf_load_type3_glyphs(fz_context *ctx, pdf_document *doc, pdf_font_desc *fontdesc); +pdf_font_desc *pdf_load_font(fz_context *ctx, pdf_document *doc, pdf_obj *rdb, pdf_obj *obj); +pdf_font_desc *pdf_load_hail_mary_font(fz_context *ctx, pdf_document *doc); + +pdf_font_desc *pdf_new_font_desc(fz_context *ctx); +pdf_font_desc *pdf_keep_font(fz_context *ctx, pdf_font_desc *fontdesc); +void pdf_drop_font(fz_context *ctx, pdf_font_desc *font); + +void pdf_print_font(fz_context *ctx, fz_output *out, pdf_font_desc *fontdesc); + +void pdf_run_glyph(fz_context *ctx, pdf_document *doc, pdf_obj *resources, fz_buffer *contents, fz_device *dev, fz_matrix ctm, void *gstate, fz_default_colorspaces *default_cs); + +pdf_obj *pdf_add_simple_font(fz_context *ctx, pdf_document *doc, fz_font *font, int encoding); + +/* + Creates CID font with Identity-H CMap and a ToUnicode CMap that + is created by using the TTF cmap table "backwards" to go from + the GID to a Unicode value. 
+ + We can possibly get width information that may have been embedded + in the PDF /W array (or W2 if vertical text) +*/ +pdf_obj *pdf_add_cid_font(fz_context *ctx, pdf_document *doc, fz_font *font); + +/* + Add a non-embedded UTF16-encoded CID-font for the CJK scripts: + CNS1, GB1, Japan1, or Korea1 +*/ +pdf_obj *pdf_add_cjk_font(fz_context *ctx, pdf_document *doc, fz_font *font, int script, int wmode, int serif); + +/* + Add a substitute font for any script. +*/ +pdf_obj *pdf_add_substitute_font(fz_context *ctx, pdf_document *doc, fz_font *font); + +int pdf_font_writing_supported(fz_font *font); + +fz_buffer *fz_extract_ttf_from_ttc(fz_context *ctx, fz_font *font); + +#endif diff --git a/include/mupdf/pdf/form.h b/include/mupdf/pdf/form.h new file mode 100644 index 0000000..6bf30c7 --- /dev/null +++ b/include/mupdf/pdf/form.h @@ -0,0 +1,382 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_PDF_FORM_H +#define MUPDF_PDF_FORM_H + +#include "mupdf/fitz/display-list.h" +#include "mupdf/pdf/document.h" + +/* Types of widget */ +enum pdf_widget_type +{ + PDF_WIDGET_TYPE_UNKNOWN, + PDF_WIDGET_TYPE_BUTTON, + PDF_WIDGET_TYPE_CHECKBOX, + PDF_WIDGET_TYPE_COMBOBOX, + PDF_WIDGET_TYPE_LISTBOX, + PDF_WIDGET_TYPE_RADIOBUTTON, + PDF_WIDGET_TYPE_SIGNATURE, + PDF_WIDGET_TYPE_TEXT, +}; + +/* Types of text widget content */ +enum pdf_widget_tx_format +{ + PDF_WIDGET_TX_FORMAT_NONE, + PDF_WIDGET_TX_FORMAT_NUMBER, + PDF_WIDGET_TX_FORMAT_SPECIAL, + PDF_WIDGET_TX_FORMAT_DATE, + PDF_WIDGET_TX_FORMAT_TIME +}; + +pdf_annot *pdf_keep_widget(fz_context *ctx, pdf_annot *widget); +void pdf_drop_widget(fz_context *ctx, pdf_annot *widget); +pdf_annot *pdf_first_widget(fz_context *ctx, pdf_page *page); +pdf_annot *pdf_next_widget(fz_context *ctx, pdf_annot *previous); +int pdf_update_widget(fz_context *ctx, pdf_annot *widget); + +/* + create a new signature widget on the specified page, with the + specified name. + + The returned pdf_annot reference must be dropped by the caller. + This is a change from releases up to and including 1.18, where + the returned reference was owned by the page and did not need + to be freed by the caller. +*/ +pdf_annot *pdf_create_signature_widget(fz_context *ctx, pdf_page *page, char *name); + +enum pdf_widget_type pdf_widget_type(fz_context *ctx, pdf_annot *widget); + +fz_rect pdf_bound_widget(fz_context *ctx, pdf_annot *widget); + +/* + get the maximum number of + characters permitted in a text widget +*/ +int pdf_text_widget_max_len(fz_context *ctx, pdf_annot *tw); + +/* + get the type of content + required by a text widget +*/ +int pdf_text_widget_format(fz_context *ctx, pdf_annot *tw); + +/* + get the list of options for a list box or combo box. + + Returns the number of options and fills in their + names within the supplied array. Should first be called with a + NULL array to find out how big the array should be.
If exportval + is true, then the export values will be returned and not the list + values if there are export values present. +*/ +int pdf_choice_widget_options(fz_context *ctx, pdf_annot *tw, int exportval, const char *opts[]); +int pdf_choice_widget_is_multiselect(fz_context *ctx, pdf_annot *tw); + +/* + get the value of a choice widget. + + Returns the number of options currently selected and fills in + the supplied array with their strings. Should first be called + with NULL as the array to find out how big the array needs to + be. The filled in elements should not be freed by the caller. +*/ +int pdf_choice_widget_value(fz_context *ctx, pdf_annot *tw, const char *opts[]); + +/* + set the value of a choice widget. + + The caller should pass the number of options selected and an + array of their names +*/ +void pdf_choice_widget_set_value(fz_context *ctx, pdf_annot *tw, int n, const char *opts[]); + +int pdf_choice_field_option_count(fz_context *ctx, pdf_obj *field); +const char *pdf_choice_field_option(fz_context *ctx, pdf_obj *field, int exportval, int i); + +int pdf_widget_is_signed(fz_context *ctx, pdf_annot *widget); +int pdf_widget_is_readonly(fz_context *ctx, pdf_annot *widget); + +/* Field flags */ +enum +{ + /* All fields */ + PDF_FIELD_IS_READ_ONLY = 1, + PDF_FIELD_IS_REQUIRED = 1 << 1, + PDF_FIELD_IS_NO_EXPORT = 1 << 2, + + /* Text fields */ + PDF_TX_FIELD_IS_MULTILINE = 1 << 12, + PDF_TX_FIELD_IS_PASSWORD = 1 << 13, + PDF_TX_FIELD_IS_FILE_SELECT = 1 << 20, + PDF_TX_FIELD_IS_DO_NOT_SPELL_CHECK = 1 << 22, + PDF_TX_FIELD_IS_DO_NOT_SCROLL = 1 << 23, + PDF_TX_FIELD_IS_COMB = 1 << 24, + PDF_TX_FIELD_IS_RICH_TEXT = 1 << 25, + + /* Button fields */ + PDF_BTN_FIELD_IS_NO_TOGGLE_TO_OFF = 1 << 14, + PDF_BTN_FIELD_IS_RADIO = 1 << 15, + PDF_BTN_FIELD_IS_PUSHBUTTON = 1 << 16, + PDF_BTN_FIELD_IS_RADIOS_IN_UNISON = 1 << 25, + + /* Choice fields */ + PDF_CH_FIELD_IS_COMBO = 1 << 17, + PDF_CH_FIELD_IS_EDIT = 1 << 18, + PDF_CH_FIELD_IS_SORT = 1 << 19, +
PDF_CH_FIELD_IS_MULTI_SELECT = 1 << 21, + PDF_CH_FIELD_IS_DO_NOT_SPELL_CHECK = 1 << 22, + PDF_CH_FIELD_IS_COMMIT_ON_SEL_CHANGE = 1 << 25, +}; + +void pdf_calculate_form(fz_context *ctx, pdf_document *doc); +void pdf_reset_form(fz_context *ctx, pdf_document *doc, pdf_obj *fields, int exclude); + +int pdf_field_type(fz_context *ctx, pdf_obj *field); +const char *pdf_field_type_string(fz_context *ctx, pdf_obj *field); +int pdf_field_flags(fz_context *ctx, pdf_obj *field); + +/* + Retrieve the name for a field as a C string that + must be freed by the caller. +*/ +char *pdf_load_field_name(fz_context *ctx, pdf_obj *field); +const char *pdf_field_value(fz_context *ctx, pdf_obj *field); +void pdf_create_field_name(fz_context *ctx, pdf_document *doc, const char *prefix, char *buf, size_t len); + +char *pdf_field_border_style(fz_context *ctx, pdf_obj *field); +void pdf_field_set_border_style(fz_context *ctx, pdf_obj *field, const char *text); +void pdf_field_set_button_caption(fz_context *ctx, pdf_obj *field, const char *text); +void pdf_field_set_fill_color(fz_context *ctx, pdf_obj *field, pdf_obj *col); +void pdf_field_set_text_color(fz_context *ctx, pdf_obj *field, pdf_obj *col); +int pdf_field_display(fz_context *ctx, pdf_obj *field); +void pdf_field_set_display(fz_context *ctx, pdf_obj *field, int d); +const char *pdf_field_label(fz_context *ctx, pdf_obj *field); +pdf_obj *pdf_button_field_on_state(fz_context *ctx, pdf_obj *field); + +int pdf_set_field_value(fz_context *ctx, pdf_document *doc, pdf_obj *field, const char *text, int ignore_trigger_events); + +/* + Update the text of a text widget. + + The text is first validated by the Field/Keystroke event processing and accepted only if it passes. + + The function returns whether validation passed. 
+*/ +int pdf_set_text_field_value(fz_context *ctx, pdf_annot *widget, const char *value); +int pdf_set_choice_field_value(fz_context *ctx, pdf_annot *widget, const char *value); +int pdf_edit_text_field_value(fz_context *ctx, pdf_annot *widget, const char *value, const char *change, int *selStart, int *selEnd, char **newvalue); + +typedef struct +{ + char *cn; + char *o; + char *ou; + char *email; + char *c; +} +pdf_pkcs7_distinguished_name; + +typedef enum +{ + PDF_SIGNATURE_ERROR_OKAY, + PDF_SIGNATURE_ERROR_NO_SIGNATURES, + PDF_SIGNATURE_ERROR_NO_CERTIFICATE, + PDF_SIGNATURE_ERROR_DIGEST_FAILURE, + PDF_SIGNATURE_ERROR_SELF_SIGNED, + PDF_SIGNATURE_ERROR_SELF_SIGNED_IN_CHAIN, + PDF_SIGNATURE_ERROR_NOT_TRUSTED, + PDF_SIGNATURE_ERROR_UNKNOWN +} pdf_signature_error; + +/* Increment the reference count for a signer object */ +typedef pdf_pkcs7_signer *(pdf_pkcs7_keep_signer_fn)(fz_context *ctx, pdf_pkcs7_signer *signer); + +/* Drop a reference for a signer object */ +typedef void (pdf_pkcs7_drop_signer_fn)(fz_context *ctx, pdf_pkcs7_signer *signer); + +/* Obtain the distinguished name information from a signer object */ +typedef pdf_pkcs7_distinguished_name *(pdf_pkcs7_get_signing_name_fn)(fz_context *ctx, pdf_pkcs7_signer *signer); + +/* Predict the size of the digest. 
The actual digest returned by create_digest will be no greater in size */ +typedef size_t (pdf_pkcs7_max_digest_size_fn)(fz_context *ctx, pdf_pkcs7_signer *signer); + +/* Create a signature based on ranges of bytes drawn from a stream */ +typedef int (pdf_pkcs7_create_digest_fn)(fz_context *ctx, pdf_pkcs7_signer *signer, fz_stream *in, unsigned char *digest, size_t digest_len); + +struct pdf_pkcs7_signer +{ + pdf_pkcs7_keep_signer_fn *keep; + pdf_pkcs7_drop_signer_fn *drop; + pdf_pkcs7_get_signing_name_fn *get_signing_name; + pdf_pkcs7_max_digest_size_fn *max_digest_size; + pdf_pkcs7_create_digest_fn *create_digest; +}; + +typedef struct pdf_pkcs7_verifier pdf_pkcs7_verifier; + +typedef void (pdf_pkcs7_drop_verifier_fn)(fz_context *ctx, pdf_pkcs7_verifier *verifier); +typedef pdf_signature_error (pdf_pkcs7_check_certificate_fn)(fz_context *ctx, pdf_pkcs7_verifier *verifier, unsigned char *signature, size_t len); +typedef pdf_signature_error (pdf_pkcs7_check_digest_fn)(fz_context *ctx, pdf_pkcs7_verifier *verifier, fz_stream *in, unsigned char *signature, size_t len); +typedef pdf_pkcs7_distinguished_name *(pdf_pkcs7_get_signatory_fn)(fz_context *ctx, pdf_pkcs7_verifier *verifier, unsigned char *signature, size_t len); + +struct pdf_pkcs7_verifier +{ + pdf_pkcs7_drop_verifier_fn *drop; + pdf_pkcs7_check_certificate_fn *check_certificate; + pdf_pkcs7_check_digest_fn *check_digest; + pdf_pkcs7_get_signatory_fn *get_signatory; +}; + +int pdf_signature_is_signed(fz_context *ctx, pdf_document *doc, pdf_obj *field); +void pdf_signature_set_value(fz_context *ctx, pdf_document *doc, pdf_obj *field, pdf_pkcs7_signer *signer, int64_t stime); + +int pdf_count_signatures(fz_context *ctx, pdf_document *doc); + +char *pdf_signature_error_description(pdf_signature_error err); + +pdf_pkcs7_distinguished_name *pdf_signature_get_signatory(fz_context *ctx, pdf_pkcs7_verifier *verifier, pdf_document *doc, pdf_obj *signature); +pdf_pkcs7_distinguished_name 
*pdf_signature_get_widget_signatory(fz_context *ctx, pdf_pkcs7_verifier *verifier, pdf_annot *widget); +void pdf_signature_drop_distinguished_name(fz_context *ctx, pdf_pkcs7_distinguished_name *name); +char *pdf_signature_format_distinguished_name(fz_context *ctx, pdf_pkcs7_distinguished_name *name); +char *pdf_signature_info(fz_context *ctx, const char *name, pdf_pkcs7_distinguished_name *dn, const char *reason, const char *location, int64_t date, int include_labels); +fz_display_list *pdf_signature_appearance_signed(fz_context *ctx, fz_rect rect, fz_text_language lang, fz_image *img, const char *left_text, const char *right_text, int include_logo); +fz_display_list *pdf_signature_appearance_unsigned(fz_context *ctx, fz_rect rect, fz_text_language lang); + +pdf_signature_error pdf_check_digest(fz_context *ctx, pdf_pkcs7_verifier *verifier, pdf_document *doc, pdf_obj *signature); +pdf_signature_error pdf_check_certificate(fz_context *ctx, pdf_pkcs7_verifier *verifier, pdf_document *doc, pdf_obj *signature); +pdf_signature_error pdf_check_widget_digest(fz_context *ctx, pdf_pkcs7_verifier *verifier, pdf_annot *widget); +pdf_signature_error pdf_check_widget_certificate(fz_context *ctx, pdf_pkcs7_verifier *verifier, pdf_annot *widget); + +void pdf_clear_signature(fz_context *ctx, pdf_annot *widget); + +/* + Sign a signature field, while assigning it an arbitrary appearance determined by a display list. + The function pdf_signature_appearance can generate a variety of common signature appearances.
+*/ +void pdf_sign_signature_with_appearance(fz_context *ctx, pdf_annot *widget, pdf_pkcs7_signer *signer, int64_t date, fz_display_list *disp_list); + +enum { + PDF_SIGNATURE_SHOW_LABELS = 1, + PDF_SIGNATURE_SHOW_DN = 2, + PDF_SIGNATURE_SHOW_DATE = 4, + PDF_SIGNATURE_SHOW_TEXT_NAME = 8, + PDF_SIGNATURE_SHOW_GRAPHIC_NAME = 16, + PDF_SIGNATURE_SHOW_LOGO = 32, +}; + +#define PDF_SIGNATURE_DEFAULT_APPEARANCE ( \ + PDF_SIGNATURE_SHOW_LABELS | \ + PDF_SIGNATURE_SHOW_DN | \ + PDF_SIGNATURE_SHOW_DATE | \ + PDF_SIGNATURE_SHOW_TEXT_NAME | \ + PDF_SIGNATURE_SHOW_GRAPHIC_NAME | \ + PDF_SIGNATURE_SHOW_LOGO ) + +/* + Sign a signature field, while assigning it a default appearance. +*/ +void pdf_sign_signature(fz_context *ctx, pdf_annot *widget, + pdf_pkcs7_signer *signer, + int appearance_flags, + fz_image *graphic, + const char *reason, + const char *location); + +/* + Create a preview of the default signature appearance. +*/ +fz_display_list *pdf_preview_signature_as_display_list(fz_context *ctx, + float w, float h, fz_text_language lang, + pdf_pkcs7_signer *signer, + int appearance_flags, + fz_image *graphic, + const char *reason, + const char *location); + +fz_pixmap *pdf_preview_signature_as_pixmap(fz_context *ctx, + int w, int h, fz_text_language lang, + pdf_pkcs7_signer *signer, + int appearance_flags, + fz_image *graphic, + const char *reason, + const char *location); + +/* + check a signature's certificate chain and digest + + This is a helper function defined to provide compatibility with older + versions of mupdf +*/ +int pdf_check_signature(fz_context *ctx, pdf_pkcs7_verifier *verifier, pdf_document *doc, pdf_obj *signature, char *ebuf, size_t ebufsize); + +void pdf_drop_signer(fz_context *ctx, pdf_pkcs7_signer *signer); +void pdf_drop_verifier(fz_context *ctx, pdf_pkcs7_verifier *verifier); + +void pdf_field_reset(fz_context *ctx, pdf_document *doc, pdf_obj *field); + +pdf_obj *pdf_lookup_field(fz_context *ctx, pdf_obj *form, const char *name); + +/* Form text 
field editing events: */ + +typedef struct +{ + const char *value; + const char *change; + int selStart, selEnd; + int willCommit; + char *newChange; + char *newValue; +} pdf_keystroke_event; + +int pdf_field_event_keystroke(fz_context *ctx, pdf_document *doc, pdf_obj *field, pdf_keystroke_event *evt); +int pdf_field_event_validate(fz_context *ctx, pdf_document *doc, pdf_obj *field, const char *value, char **newvalue); +void pdf_field_event_calculate(fz_context *ctx, pdf_document *doc, pdf_obj *field); +char *pdf_field_event_format(fz_context *ctx, pdf_document *doc, pdf_obj *field); + +int pdf_annot_field_event_keystroke(fz_context *ctx, pdf_document *doc, pdf_annot *annot, pdf_keystroke_event *evt); + +/* Call these to trigger actions from various UI events: */ + +void pdf_document_event_will_close(fz_context *ctx, pdf_document *doc); +void pdf_document_event_will_save(fz_context *ctx, pdf_document *doc); +void pdf_document_event_did_save(fz_context *ctx, pdf_document *doc); +void pdf_document_event_will_print(fz_context *ctx, pdf_document *doc); +void pdf_document_event_did_print(fz_context *ctx, pdf_document *doc); + +void pdf_page_event_open(fz_context *ctx, pdf_page *page); +void pdf_page_event_close(fz_context *ctx, pdf_page *page); + +void pdf_annot_event_enter(fz_context *ctx, pdf_annot *annot); +void pdf_annot_event_exit(fz_context *ctx, pdf_annot *annot); +void pdf_annot_event_down(fz_context *ctx, pdf_annot *annot); +void pdf_annot_event_up(fz_context *ctx, pdf_annot *annot); +void pdf_annot_event_focus(fz_context *ctx, pdf_annot *annot); +void pdf_annot_event_blur(fz_context *ctx, pdf_annot *annot); +void pdf_annot_event_page_open(fz_context *ctx, pdf_annot *annot); +void pdf_annot_event_page_close(fz_context *ctx, pdf_annot *annot); +void pdf_annot_event_page_visible(fz_context *ctx, pdf_annot *annot); +void pdf_annot_event_page_invisible(fz_context *ctx, pdf_annot *annot); + +#endif diff --git a/include/mupdf/pdf/interpret.h 
b/include/mupdf/pdf/interpret.h new file mode 100644 index 0000000..144d8ea --- /dev/null +++ b/include/mupdf/pdf/interpret.h @@ -0,0 +1,452 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef PDF_INTERPRET_H +#define PDF_INTERPRET_H + +#include "mupdf/pdf/font.h" +#include "mupdf/pdf/resource.h" +#include "mupdf/pdf/document.h" + +typedef struct pdf_gstate pdf_gstate; +typedef struct pdf_processor pdf_processor; + +void *pdf_new_processor(fz_context *ctx, int size); +pdf_processor *pdf_keep_processor(fz_context *ctx, pdf_processor *proc); +void pdf_close_processor(fz_context *ctx, pdf_processor *proc); +void pdf_drop_processor(fz_context *ctx, pdf_processor *proc); + +struct pdf_processor +{ + int refs; + + /* close the processor. Also closes any chained processors. */ + void (*close_processor)(fz_context *ctx, pdf_processor *proc); + void (*drop_processor)(fz_context *ctx, pdf_processor *proc); + + /* At any stage, we can have one set of resources in place. + * This function gives us a set of resources to use. 
We remember + * any previous set on a stack, so we can pop back to it later. + * Our responsibility (as well as remembering it for our own use) + * is to pass either it, or a filtered version of it onto any + * chained processor. */ + void (*push_resources)(fz_context *ctx, pdf_processor *proc, pdf_obj *res); + /* Pop the resources stack. This must be passed on to any chained + * processors. This returns a pointer to the resource dict just + * popped by the deepest filter. The caller inherits this reference. */ + pdf_obj *(*pop_resources)(fz_context *ctx, pdf_processor *proc); + + /* general graphics state */ + void (*op_w)(fz_context *ctx, pdf_processor *proc, float linewidth); + void (*op_j)(fz_context *ctx, pdf_processor *proc, int linejoin); + void (*op_J)(fz_context *ctx, pdf_processor *proc, int linecap); + void (*op_M)(fz_context *ctx, pdf_processor *proc, float miterlimit); + void (*op_d)(fz_context *ctx, pdf_processor *proc, pdf_obj *array, float phase); + void (*op_ri)(fz_context *ctx, pdf_processor *proc, const char *intent); + void (*op_i)(fz_context *ctx, pdf_processor *proc, float flatness); + + void (*op_gs_begin)(fz_context *ctx, pdf_processor *proc, const char *name, pdf_obj *extgstate); + void (*op_gs_BM)(fz_context *ctx, pdf_processor *proc, const char *blendmode); + void (*op_gs_ca)(fz_context *ctx, pdf_processor *proc, float alpha); + void (*op_gs_CA)(fz_context *ctx, pdf_processor *proc, float alpha); + void (*op_gs_SMask)(fz_context *ctx, pdf_processor *proc, pdf_obj *smask, float *bc, int luminosity); + void (*op_gs_end)(fz_context *ctx, pdf_processor *proc); + + /* special graphics state */ + void (*op_q)(fz_context *ctx, pdf_processor *proc); + void (*op_Q)(fz_context *ctx, pdf_processor *proc); + void (*op_cm)(fz_context *ctx, pdf_processor *proc, float a, float b, float c, float d, float e, float f); + + /* path construction */ + void (*op_m)(fz_context *ctx, pdf_processor *proc, float x, float y); + void (*op_l)(fz_context *ctx, 
pdf_processor *proc, float x, float y); + void (*op_c)(fz_context *ctx, pdf_processor *proc, float x1, float y1, float x2, float y2, float x3, float y3); + void (*op_v)(fz_context *ctx, pdf_processor *proc, float x2, float y2, float x3, float y3); + void (*op_y)(fz_context *ctx, pdf_processor *proc, float x1, float y1, float x3, float y3); + void (*op_h)(fz_context *ctx, pdf_processor *proc); + void (*op_re)(fz_context *ctx, pdf_processor *proc, float x, float y, float w, float h); + + /* path painting */ + void (*op_S)(fz_context *ctx, pdf_processor *proc); + void (*op_s)(fz_context *ctx, pdf_processor *proc); + void (*op_F)(fz_context *ctx, pdf_processor *proc); + void (*op_f)(fz_context *ctx, pdf_processor *proc); + void (*op_fstar)(fz_context *ctx, pdf_processor *proc); + void (*op_B)(fz_context *ctx, pdf_processor *proc); + void (*op_Bstar)(fz_context *ctx, pdf_processor *proc); + void (*op_b)(fz_context *ctx, pdf_processor *proc); + void (*op_bstar)(fz_context *ctx, pdf_processor *proc); + void (*op_n)(fz_context *ctx, pdf_processor *proc); + + /* clipping paths */ + void (*op_W)(fz_context *ctx, pdf_processor *proc); + void (*op_Wstar)(fz_context *ctx, pdf_processor *proc); + + /* text objects */ + void (*op_BT)(fz_context *ctx, pdf_processor *proc); + void (*op_ET)(fz_context *ctx, pdf_processor *proc); + + /* text state */ + void (*op_Tc)(fz_context *ctx, pdf_processor *proc, float charspace); + void (*op_Tw)(fz_context *ctx, pdf_processor *proc, float wordspace); + void (*op_Tz)(fz_context *ctx, pdf_processor *proc, float scale); + void (*op_TL)(fz_context *ctx, pdf_processor *proc, float leading); + void (*op_Tf)(fz_context *ctx, pdf_processor *proc, const char *name, pdf_font_desc *font, float size); + void (*op_Tr)(fz_context *ctx, pdf_processor *proc, int render); + void (*op_Ts)(fz_context *ctx, pdf_processor *proc, float rise); + + /* text positioning */ + void (*op_Td)(fz_context *ctx, pdf_processor *proc, float tx, float ty); + void 
(*op_TD)(fz_context *ctx, pdf_processor *proc, float tx, float ty); + void (*op_Tm)(fz_context *ctx, pdf_processor *proc, float a, float b, float c, float d, float e, float f); + void (*op_Tstar)(fz_context *ctx, pdf_processor *proc); + + /* text showing */ + void (*op_TJ)(fz_context *ctx, pdf_processor *proc, pdf_obj *array); + void (*op_Tj)(fz_context *ctx, pdf_processor *proc, char *str, size_t len); + void (*op_squote)(fz_context *ctx, pdf_processor *proc, char *str, size_t len); + void (*op_dquote)(fz_context *ctx, pdf_processor *proc, float aw, float ac, char *str, size_t len); + + /* type 3 fonts */ + void (*op_d0)(fz_context *ctx, pdf_processor *proc, float wx, float wy); + void (*op_d1)(fz_context *ctx, pdf_processor *proc, float wx, float wy, float llx, float lly, float urx, float ury); + + /* color */ + void (*op_CS)(fz_context *ctx, pdf_processor *proc, const char *name, fz_colorspace *cs); + void (*op_cs)(fz_context *ctx, pdf_processor *proc, const char *name, fz_colorspace *cs); + void (*op_SC_pattern)(fz_context *ctx, pdf_processor *proc, const char *name, pdf_pattern *pat, int n, float *color); + void (*op_sc_pattern)(fz_context *ctx, pdf_processor *proc, const char *name, pdf_pattern *pat, int n, float *color); + void (*op_SC_shade)(fz_context *ctx, pdf_processor *proc, const char *name, fz_shade *shade); + void (*op_sc_shade)(fz_context *ctx, pdf_processor *proc, const char *name, fz_shade *shade); + void (*op_SC_color)(fz_context *ctx, pdf_processor *proc, int n, float *color); + void (*op_sc_color)(fz_context *ctx, pdf_processor *proc, int n, float *color); + + void (*op_G)(fz_context *ctx, pdf_processor *proc, float g); + void (*op_g)(fz_context *ctx, pdf_processor *proc, float g); + void (*op_RG)(fz_context *ctx, pdf_processor *proc, float r, float g, float b); + void (*op_rg)(fz_context *ctx, pdf_processor *proc, float r, float g, float b); + void (*op_K)(fz_context *ctx, pdf_processor *proc, float c, float m, float y, float k); + void 
(*op_k)(fz_context *ctx, pdf_processor *proc, float c, float m, float y, float k); + + /* shadings, images, xobjects */ + void (*op_BI)(fz_context *ctx, pdf_processor *proc, fz_image *image, const char *colorspace_name); + void (*op_sh)(fz_context *ctx, pdf_processor *proc, const char *name, fz_shade *shade); + void (*op_Do_image)(fz_context *ctx, pdf_processor *proc, const char *name, fz_image *image); + void (*op_Do_form)(fz_context *ctx, pdf_processor *proc, const char *name, pdf_obj *form); + + /* marked content */ + void (*op_MP)(fz_context *ctx, pdf_processor *proc, const char *tag); + void (*op_DP)(fz_context *ctx, pdf_processor *proc, const char *tag, pdf_obj *raw, pdf_obj *cooked); + void (*op_BMC)(fz_context *ctx, pdf_processor *proc, const char *tag); + void (*op_BDC)(fz_context *ctx, pdf_processor *proc, const char *tag, pdf_obj *raw, pdf_obj *cooked); + void (*op_EMC)(fz_context *ctx, pdf_processor *proc); + + /* compatibility */ + void (*op_BX)(fz_context *ctx, pdf_processor *proc); + void (*op_EX)(fz_context *ctx, pdf_processor *proc); + + /* Virtual ops for ExtGState entries */ + void (*op_gs_OP)(fz_context *ctx, pdf_processor *proc, int b); + void (*op_gs_op)(fz_context *ctx, pdf_processor *proc, int b); + void (*op_gs_OPM)(fz_context *ctx, pdf_processor *proc, int i); + void (*op_gs_UseBlackPtComp)(fz_context *ctx, pdf_processor *proc, pdf_obj *name); + + /* END is used to signify end of stream (finalise and close down) */ + void (*op_END)(fz_context *ctx, pdf_processor *proc); + + /* interpreter state that persists across content streams */ + const char *usage; + int hidden; +}; + +typedef struct +{ + /* input */ + pdf_document *doc; + pdf_obj *rdb; + pdf_lexbuf *buf; + fz_cookie *cookie; + + /* state */ + int gstate; + int xbalance; + int in_text; + fz_rect d1_rect; + + /* stack */ + pdf_obj *obj; + char name[256]; + char string[256]; + size_t string_len; + int top; + float stack[32]; +} pdf_csi; + +/* Functions to set up pdf_process structures 
*/ + +pdf_processor *pdf_new_run_processor(fz_context *ctx, pdf_document *doc, fz_device *dev, fz_matrix ctm, int struct_parent, const char *usage, pdf_gstate *gstate, fz_default_colorspaces *default_cs, fz_cookie *cookie); + +/* + Create a buffer processor. + + This collects the incoming PDF operator stream into an fz_buffer. + + buffer: The (possibly empty) buffer to which operators will be + appended. + + ahxencode: If 0, then image streams will be sent as binary, + otherwise they will be asciihexencoded. +*/ +pdf_processor *pdf_new_buffer_processor(fz_context *ctx, fz_buffer *buffer, int ahxencode); + +/* + Create an output processor. This + sends the incoming PDF operator stream to an fz_output stream. + + out: The output stream to which operators will be sent. + + ahxencode: If 0, then image streams will be sent as binary, + otherwise they will be asciihexencoded. +*/ +pdf_processor *pdf_new_output_processor(fz_context *ctx, fz_output *out, int ahxencode); + +typedef struct pdf_filter_options pdf_filter_options; + +/* + Create a filter processor. This filters the PDF operators + it is fed, and passes them down (with some changes) to the + child filter. + + chain: The child processor to which the filtered operators + will be fed. + + The options field contains a pointer to a structure with + filter specific options in. +*/ +typedef pdf_processor *(pdf_filter_factory_fn)(fz_context *ctx, pdf_document *doc, pdf_processor *chain, int struct_parents, fz_matrix transform, pdf_filter_options *options, void *factory_options); + +/* + A pdf_filter_factory is a pdf_filter_factory_fn, plus the options + needed to instantiate it. +*/ +typedef struct +{ + pdf_filter_factory_fn *filter; + void *options; +} pdf_filter_factory; + +/* + recurse: Filter resources recursively. + + instance_forms: Always recurse on XObject Form resources, but will + create a new instance of each XObject Form that is used, filtered + individually.
+ + ascii: If true, escape all binary data in the output. + + no_update: If true, do not update the document at the end. + + opaque: Opaque value that is passed to the complete function. + + complete: A function called at the end of processing. + This allows the caller to insert some extra content after + all other content. + + filters: Pointer to an array of filter factory/options. + The array is terminated by an entry with a NULL factory pointer. + Operators will be fed into the filter generated from the first + factory function in the list, and from there go to the filter + generated from the second factory in the list etc. +*/ +struct pdf_filter_options +{ + int recurse; + int instance_forms; + int ascii; + int no_update; + + void *opaque; + void (*complete)(fz_context *ctx, fz_buffer *buffer, void *arg); + + pdf_filter_factory *filters; +}; + +typedef enum +{ + FZ_CULL_PATH_FILL, + FZ_CULL_PATH_STROKE, + FZ_CULL_PATH_FILL_STROKE, + FZ_CULL_CLIP_PATH, + FZ_CULL_GLYPH, + FZ_CULL_IMAGE, + FZ_CULL_SHADING +} fz_cull_type; + +/* + image_filter: A function called to assess whether a given + image should be removed or not. + + text_filter: A function called to assess whether a given + character should be removed or not. + + after_text_object: A function called after each text object. + This allows the caller to insert some extra content if + desired. + + culler: A function called to see whether each object should + be culled or not. +*/ +typedef struct +{ + void *opaque; + fz_image *(*image_filter)(fz_context *ctx, void *opaque, fz_matrix ctm, const char *name, fz_image *image); + int (*text_filter)(fz_context *ctx, void *opaque, int *ucsbuf, int ucslen, fz_matrix trm, fz_matrix ctm, fz_rect bbox); + void (*after_text_object)(fz_context *ctx, void *opaque, pdf_document *doc, pdf_processor *chain, fz_matrix ctm); + int (*culler)(fz_context *ctx, void *opaque, fz_rect bbox, fz_cull_type type); +} +pdf_sanitize_filter_options; + +/* + A sanitize filter factory. 
+ + sopts = pointer to pdf_sanitize_filter_options. + + The changes made by a filter generated from this are: + + * No operations are allowed to change the top level gstate. + Additional q/Q operators are inserted to prevent this. + + * Repeated/unnecessary colour operators are removed (so, + for example, "0 0 0 rg 0 1 rg 0.5 g" would be sanitised to + "0.5 g") + + The intention of these changes is to provide a simpler, + but equivalent stream, repairing problems with mismatched + operators, maintaining structure (such as BMC, EMC calls) + and leaving the graphics state in a known (default) state + so that subsequent operations (such as synthesising new + operators to be appended to the stream) are easier. + + The net graphical effect of the filtered operator stream + should be identical to the incoming operator stream. +*/ +pdf_processor *pdf_new_sanitize_filter(fz_context *ctx, pdf_document *doc, pdf_processor *chain, int struct_parents, fz_matrix transform, pdf_filter_options *options, void *sopts); + +pdf_obj *pdf_filter_xobject_instance(fz_context *ctx, pdf_obj *old_xobj, pdf_obj *page_res, fz_matrix ctm, pdf_filter_options *options, pdf_cycle_list *cycle_up); + +void pdf_processor_push_resources(fz_context *ctx, pdf_processor *proc, pdf_obj *res); + +pdf_obj *pdf_processor_pop_resources(fz_context *ctx, pdf_processor *proc); + +/* + opaque: Opaque value that is passed to all the filter functions. + + color_rewrite: function pointer called to rewrite a color + On entry: + *cs = reference to a pdf object representing the colorspace. + + *n = number of color components + + color = *n color values. + + On exit: + *cs either the same (for no change in colorspace) or + updated to be a new one. Reference must be dropped, and + a new kept reference returned! + + *n = number of color components (maybe updated) + + color = *n color values (maybe updated) + + image_rewrite: function pointer called to rewrite an image + On entry: + *image = reference to an fz_image.
+ + On exit: + *image either the same (for no change) or updated + to be a new one. Reference must be dropped, and a + new kept reference returned. +*/ +typedef struct +{ + void *opaque; + void (*color_rewrite)(fz_context *ctx, void *opaque, pdf_obj **cs, int *n, float color[FZ_MAX_COLORS]); + void (*image_rewrite)(fz_context *ctx, void *opaque, fz_image **image); + pdf_shade_recolorer *shade_rewrite; +} pdf_color_filter_options; + +pdf_processor * +pdf_new_color_filter(fz_context *ctx, pdf_document *doc, pdf_processor *chain, int struct_parents, fz_matrix transform, pdf_filter_options *options, void *copts); + +/* + Functions to actually process annotations, glyphs and general stream objects. +*/ +void pdf_process_contents(fz_context *ctx, pdf_processor *proc, pdf_document *doc, pdf_obj *obj, pdf_obj *res, fz_cookie *cookie, pdf_obj **out_res); +void pdf_process_annot(fz_context *ctx, pdf_processor *proc, pdf_annot *annot, fz_cookie *cookie); +void pdf_process_glyph(fz_context *ctx, pdf_processor *proc, pdf_document *doc, pdf_obj *resources, fz_buffer *contents); + +/* + Function to process a contents stream without handling the resources. + The caller is responsible for pushing/popping the resources. 
+*/ +void pdf_process_raw_contents(fz_context *ctx, pdf_processor *proc, pdf_document *doc, pdf_obj *rdb, pdf_obj *stmobj, fz_cookie *cookie); + +/* Text handling helper functions */ +typedef struct +{ + float char_space; + float word_space; + float scale; + float leading; + pdf_font_desc *font; + float size; + int render; + float rise; +} pdf_text_state; + +typedef struct +{ + fz_text *text; + fz_rect text_bbox; + fz_matrix tlm; + fz_matrix tm; + int text_mode; + + int cid; + int gid; + fz_rect char_bbox; + pdf_font_desc *fontdesc; + float char_tx; + float char_ty; +} pdf_text_object_state; + +void pdf_tos_save(fz_context *ctx, pdf_text_object_state *tos, fz_matrix save[2]); +void pdf_tos_restore(fz_context *ctx, pdf_text_object_state *tos, fz_matrix save[2]); +fz_text *pdf_tos_get_text(fz_context *ctx, pdf_text_object_state *tos); +void pdf_tos_reset(fz_context *ctx, pdf_text_object_state *tos, int render); +int pdf_tos_make_trm(fz_context *ctx, pdf_text_object_state *tos, pdf_text_state *text, pdf_font_desc *fontdesc, int cid, fz_matrix *trm); +void pdf_tos_move_after_char(fz_context *ctx, pdf_text_object_state *tos); +void pdf_tos_translate(pdf_text_object_state *tos, float tx, float ty); +void pdf_tos_set_matrix(pdf_text_object_state *tos, float a, float b, float c, float d, float e, float f); +void pdf_tos_newline(pdf_text_object_state *tos, float leading); + +#endif diff --git a/include/mupdf/pdf/javascript.h b/include/mupdf/pdf/javascript.h new file mode 100644 index 0000000..0de3a73 --- /dev/null +++ b/include/mupdf/pdf/javascript.h @@ -0,0 +1,43 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. 
+// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_PDF_JAVASCRIPT_H +#define MUPDF_PDF_JAVASCRIPT_H + +#include "mupdf/pdf/document.h" +#include "mupdf/pdf/form.h" + +void pdf_enable_js(fz_context *ctx, pdf_document *doc); +void pdf_disable_js(fz_context *ctx, pdf_document *doc); +int pdf_js_supported(fz_context *ctx, pdf_document *doc); +void pdf_drop_js(fz_context *ctx, pdf_js *js); + +void pdf_js_event_init(pdf_js *js, pdf_obj *target, const char *value, int willCommit); +int pdf_js_event_result(pdf_js *js); +int pdf_js_event_result_validate(pdf_js *js, char **newvalue); +char *pdf_js_event_value(pdf_js *js); +void pdf_js_event_init_keystroke(pdf_js *js, pdf_obj *target, pdf_keystroke_event *evt); +int pdf_js_event_result_keystroke(pdf_js *js, pdf_keystroke_event *evt); + +void pdf_js_execute(pdf_js *js, const char *name, const char *code, char **result); + +#endif diff --git a/include/mupdf/pdf/name-table.h b/include/mupdf/pdf/name-table.h new file mode 100644 index 0000000..fab2ea8 --- /dev/null +++ b/include/mupdf/pdf/name-table.h @@ -0,0 +1,585 @@ +// Copyright (C) 2004-2023 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. 
+// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +/* Alphabetically sorted list of all PDF names to be available as constants */ +PDF_MAKE_NAME("1.2", 1_2) +PDF_MAKE_NAME("3D", 3D) +PDF_MAKE_NAME("A", A) +PDF_MAKE_NAME("A85", A85) +PDF_MAKE_NAME("AA", AA) +PDF_MAKE_NAME("AC", AC) +PDF_MAKE_NAME("AESV2", AESV2) +PDF_MAKE_NAME("AESV3", AESV3) +PDF_MAKE_NAME("AHx", AHx) +PDF_MAKE_NAME("AP", AP) +PDF_MAKE_NAME("AS", AS) +PDF_MAKE_NAME("ASCII85Decode", ASCII85Decode) +PDF_MAKE_NAME("ASCIIHexDecode", ASCIIHexDecode) +PDF_MAKE_NAME("AcroForm", AcroForm) +PDF_MAKE_NAME("Action", Action) +PDF_MAKE_NAME("ActualText", ActualText) +PDF_MAKE_NAME("Adobe.PPKLite", Adobe_PPKLite) +PDF_MAKE_NAME("All", All) +PDF_MAKE_NAME("AllOff", AllOff) +PDF_MAKE_NAME("AllOn", AllOn) +PDF_MAKE_NAME("Alpha", Alpha) +PDF_MAKE_NAME("Alt", Alt) +PDF_MAKE_NAME("Alternate", Alternate) +PDF_MAKE_NAME("Annot", Annot) +PDF_MAKE_NAME("Annots", Annots) +PDF_MAKE_NAME("AnyOff", AnyOff) +PDF_MAKE_NAME("App", App) +PDF_MAKE_NAME("Approved", Approved) +PDF_MAKE_NAME("Art", Art) +PDF_MAKE_NAME("ArtBox", ArtBox) +PDF_MAKE_NAME("Artifact", Artifact) +PDF_MAKE_NAME("AsIs", AsIs) +PDF_MAKE_NAME("Ascent", Ascent) +PDF_MAKE_NAME("Aside", Aside) +PDF_MAKE_NAME("AuthEvent", AuthEvent) +PDF_MAKE_NAME("Author", Author) +PDF_MAKE_NAME("B", B) +PDF_MAKE_NAME("BBox", BBox) +PDF_MAKE_NAME("BC", BC) +PDF_MAKE_NAME("BE", BE) +PDF_MAKE_NAME("BG", BG) +PDF_MAKE_NAME("BM", BM) 
+PDF_MAKE_NAME("BPC", BPC) +PDF_MAKE_NAME("BS", BS) +PDF_MAKE_NAME("Background", Background) +PDF_MAKE_NAME("BaseEncoding", BaseEncoding) +PDF_MAKE_NAME("BaseFont", BaseFont) +PDF_MAKE_NAME("BaseState", BaseState) +PDF_MAKE_NAME("BibEntry", BibEntry) +PDF_MAKE_NAME("BitsPerComponent", BitsPerComponent) +PDF_MAKE_NAME("BitsPerCoordinate", BitsPerCoordinate) +PDF_MAKE_NAME("BitsPerFlag", BitsPerFlag) +PDF_MAKE_NAME("BitsPerSample", BitsPerSample) +PDF_MAKE_NAME("BlackIs1", BlackIs1) +PDF_MAKE_NAME("BlackPoint", BlackPoint) +PDF_MAKE_NAME("BleedBox", BleedBox) +PDF_MAKE_NAME("Blinds", Blinds) +PDF_MAKE_NAME("BlockQuote", BlockQuote) +PDF_MAKE_NAME("Border", Border) +PDF_MAKE_NAME("Bounds", Bounds) +PDF_MAKE_NAME("Box", Box) +PDF_MAKE_NAME("Bt", Bt) +PDF_MAKE_NAME("Btn", Btn) +PDF_MAKE_NAME("Butt", Butt) +PDF_MAKE_NAME("ByteRange", ByteRange) +PDF_MAKE_NAME("C", C) +PDF_MAKE_NAME("C0", C0) +PDF_MAKE_NAME("C1", C1) +PDF_MAKE_NAME("CA", CA) +PDF_MAKE_NAME("CCF", CCF) +PDF_MAKE_NAME("CCITTFaxDecode", CCITTFaxDecode) +PDF_MAKE_NAME("CF", CF) +PDF_MAKE_NAME("CFM", CFM) +PDF_MAKE_NAME("CI", CI) +PDF_MAKE_NAME("CIDFontType0", CIDFontType0) +PDF_MAKE_NAME("CIDFontType0C", CIDFontType0C) +PDF_MAKE_NAME("CIDFontType2", CIDFontType2) +PDF_MAKE_NAME("CIDSystemInfo", CIDSystemInfo) +PDF_MAKE_NAME("CIDToGIDMap", CIDToGIDMap) +PDF_MAKE_NAME("CMYK", CMYK) +PDF_MAKE_NAME("CS", CS) +PDF_MAKE_NAME("CalCMYK", CalCMYK) +PDF_MAKE_NAME("CalGray", CalGray) +PDF_MAKE_NAME("CalRGB", CalRGB) +PDF_MAKE_NAME("CapHeight", CapHeight) +PDF_MAKE_NAME("Caption", Caption) +PDF_MAKE_NAME("Caret", Caret) +PDF_MAKE_NAME("Catalog", Catalog) +PDF_MAKE_NAME("Cert", Cert) +PDF_MAKE_NAME("Ch", Ch) +PDF_MAKE_NAME("Changes", Changes) +PDF_MAKE_NAME("CharProcs", CharProcs) +PDF_MAKE_NAME("CheckSum", CheckSum) +PDF_MAKE_NAME("Circle", Circle) +PDF_MAKE_NAME("ClosedArrow", ClosedArrow) +PDF_MAKE_NAME("Code", Code) +PDF_MAKE_NAME("Collection", Collection) +PDF_MAKE_NAME("ColorSpace", ColorSpace) 
+PDF_MAKE_NAME("ColorTransform", ColorTransform) +PDF_MAKE_NAME("Colorants", Colorants) +PDF_MAKE_NAME("Colors", Colors) +PDF_MAKE_NAME("Columns", Columns) +PDF_MAKE_NAME("Confidential", Confidential) +PDF_MAKE_NAME("Configs", Configs) +PDF_MAKE_NAME("ContactInfo", ContactInfo) +PDF_MAKE_NAME("Contents", Contents) +PDF_MAKE_NAME("Coords", Coords) +PDF_MAKE_NAME("Count", Count) +PDF_MAKE_NAME("Cover", Cover) +PDF_MAKE_NAME("CreationDate", CreationDate) +PDF_MAKE_NAME("Creator", Creator) +PDF_MAKE_NAME("CropBox", CropBox) +PDF_MAKE_NAME("Crypt", Crypt) +PDF_MAKE_NAME("D", D) +PDF_MAKE_NAME("DA", DA) +PDF_MAKE_NAME("DC", DC) +PDF_MAKE_NAME("DCT", DCT) +PDF_MAKE_NAME("DCTDecode", DCTDecode) +PDF_MAKE_NAME("DL", DL) +PDF_MAKE_NAME("DOS", DOS) +PDF_MAKE_NAME("DP", DP) +PDF_MAKE_NAME("DR", DR) +PDF_MAKE_NAME("DS", DS) +PDF_MAKE_NAME("DV", DV) +PDF_MAKE_NAME("DW", DW) +PDF_MAKE_NAME("DW2", DW2) +PDF_MAKE_NAME("DamagedRowsBeforeError", DamagedRowsBeforeError) +PDF_MAKE_NAME("Data", Data) +PDF_MAKE_NAME("Date", Date) +PDF_MAKE_NAME("Decode", Decode) +PDF_MAKE_NAME("DecodeParms", DecodeParms) +PDF_MAKE_NAME("Default", Default) +PDF_MAKE_NAME("DefaultCMYK", DefaultCMYK) +PDF_MAKE_NAME("DefaultGray", DefaultGray) +PDF_MAKE_NAME("DefaultRGB", DefaultRGB) +PDF_MAKE_NAME("Departmental", Departmental) +PDF_MAKE_NAME("Desc", Desc) +PDF_MAKE_NAME("DescendantFonts", DescendantFonts) +PDF_MAKE_NAME("Descent", Descent) +PDF_MAKE_NAME("Design", Design) +PDF_MAKE_NAME("Dest", Dest) +PDF_MAKE_NAME("DestOutputProfile", DestOutputProfile) +PDF_MAKE_NAME("Dests", Dests) +PDF_MAKE_NAME("DeviceCMYK", DeviceCMYK) +PDF_MAKE_NAME("DeviceGray", DeviceGray) +PDF_MAKE_NAME("DeviceN", DeviceN) +PDF_MAKE_NAME("DeviceRGB", DeviceRGB) +PDF_MAKE_NAME("Di", Di) +PDF_MAKE_NAME("Diamond", Diamond) +PDF_MAKE_NAME("Differences", Differences) +PDF_MAKE_NAME("DigestLocation", DigestLocation) +PDF_MAKE_NAME("DigestMethod", DigestMethod) +PDF_MAKE_NAME("DigestValue", DigestValue) +PDF_MAKE_NAME("Dissolve", 
Dissolve) +PDF_MAKE_NAME("Div", Div) +PDF_MAKE_NAME("Dm", Dm) +PDF_MAKE_NAME("DocMDP", DocMDP) +PDF_MAKE_NAME("Document", Document) +PDF_MAKE_NAME("DocumentFragment", DocumentFragment) +PDF_MAKE_NAME("Domain", Domain) +PDF_MAKE_NAME("Draft", Draft) +PDF_MAKE_NAME("Dur", Dur) +PDF_MAKE_NAME("E", E) +PDF_MAKE_NAME("EF", EF) +PDF_MAKE_NAME("EarlyChange", EarlyChange) +PDF_MAKE_NAME("Em", Em) +PDF_MAKE_NAME("EmbeddedFile", EmbeddedFile) +PDF_MAKE_NAME("EmbeddedFiles", EmbeddedFiles) +PDF_MAKE_NAME("Encode", Encode) +PDF_MAKE_NAME("EncodedByteAlign", EncodedByteAlign) +PDF_MAKE_NAME("Encoding", Encoding) +PDF_MAKE_NAME("Encrypt", Encrypt) +PDF_MAKE_NAME("EncryptMetadata", EncryptMetadata) +PDF_MAKE_NAME("EndOfBlock", EndOfBlock) +PDF_MAKE_NAME("EndOfLine", EndOfLine) +PDF_MAKE_NAME("Exclude", Exclude) +PDF_MAKE_NAME("Experimental", Experimental) +PDF_MAKE_NAME("Expired", Expired) +PDF_MAKE_NAME("ExtGState", ExtGState) +PDF_MAKE_NAME("Extend", Extend) +PDF_MAKE_NAME("F", F) +PDF_MAKE_NAME("FENote", FENote) +PDF_MAKE_NAME("FL", FL) +PDF_MAKE_NAME("FRM", FRM) +PDF_MAKE_NAME("FS", FS) +PDF_MAKE_NAME("FT", FT) +PDF_MAKE_NAME("Fade", Fade) +PDF_MAKE_NAME("Ff", Ff) +PDF_MAKE_NAME("FieldMDP", FieldMDP) +PDF_MAKE_NAME("Fields", Fields) +PDF_MAKE_NAME("Figure", Figure) +PDF_MAKE_NAME("FileAttachment", FileAttachment) +PDF_MAKE_NAME("FileSize", FileSize) +PDF_MAKE_NAME("Filespec", Filespec) +PDF_MAKE_NAME("Filter", Filter) +PDF_MAKE_NAME("Final", Final) +PDF_MAKE_NAME("Fingerprint", Fingerprint) +PDF_MAKE_NAME("First", First) +PDF_MAKE_NAME("FirstChar", FirstChar) +PDF_MAKE_NAME("FirstPage", FirstPage) +PDF_MAKE_NAME("Fit", Fit) +PDF_MAKE_NAME("FitB", FitB) +PDF_MAKE_NAME("FitBH", FitBH) +PDF_MAKE_NAME("FitBV", FitBV) +PDF_MAKE_NAME("FitH", FitH) +PDF_MAKE_NAME("FitR", FitR) +PDF_MAKE_NAME("FitV", FitV) +PDF_MAKE_NAME("Fl", Fl) +PDF_MAKE_NAME("Flags", Flags) +PDF_MAKE_NAME("FlateDecode", FlateDecode) +PDF_MAKE_NAME("Fly", Fly) +PDF_MAKE_NAME("Font", Font) 
+PDF_MAKE_NAME("FontBBox", FontBBox) +PDF_MAKE_NAME("FontDescriptor", FontDescriptor) +PDF_MAKE_NAME("FontFile", FontFile) +PDF_MAKE_NAME("FontFile2", FontFile2) +PDF_MAKE_NAME("FontFile3", FontFile3) +PDF_MAKE_NAME("FontMatrix", FontMatrix) +PDF_MAKE_NAME("FontName", FontName) +PDF_MAKE_NAME("ForComment", ForComment) +PDF_MAKE_NAME("ForPublicRelease", ForPublicRelease) +PDF_MAKE_NAME("Form", Form) +PDF_MAKE_NAME("FormEx", FormEx) +PDF_MAKE_NAME("FormType", FormType) +PDF_MAKE_NAME("Formula", Formula) +PDF_MAKE_NAME("FreeText", FreeText) +PDF_MAKE_NAME("Function", Function) +PDF_MAKE_NAME("FunctionType", FunctionType) +PDF_MAKE_NAME("Functions", Functions) +PDF_MAKE_NAME("G", G) +PDF_MAKE_NAME("GTS_PDFX", GTS_PDFX) +PDF_MAKE_NAME("Gamma", Gamma) +PDF_MAKE_NAME("Glitter", Glitter) +PDF_MAKE_NAME("GoTo", GoTo) +PDF_MAKE_NAME("GoToR", GoToR) +PDF_MAKE_NAME("Group", Group) +PDF_MAKE_NAME("H", H) +PDF_MAKE_NAME("H1", H1) +PDF_MAKE_NAME("H2", H2) +PDF_MAKE_NAME("H3", H3) +PDF_MAKE_NAME("H4", H4) +PDF_MAKE_NAME("H5", H5) +PDF_MAKE_NAME("H6", H6) +PDF_MAKE_NAME("Height", Height) +PDF_MAKE_NAME("Helv", Helv) +PDF_MAKE_NAME("Highlight", Highlight) +PDF_MAKE_NAME("HistoryPos", HistoryPos) +PDF_MAKE_NAME("I", I) +PDF_MAKE_NAME("IC", IC) +PDF_MAKE_NAME("ICCBased", ICCBased) +PDF_MAKE_NAME("ID", ID) +PDF_MAKE_NAME("IM", IM) +PDF_MAKE_NAME("IRT", IRT) +PDF_MAKE_NAME("Identity", Identity) +PDF_MAKE_NAME("Identity-H", Identity_H) +PDF_MAKE_NAME("Identity-V", Identity_V) +PDF_MAKE_NAME("Image", Image) +PDF_MAKE_NAME("ImageB", ImageB) +PDF_MAKE_NAME("ImageC", ImageC) +PDF_MAKE_NAME("ImageI", ImageI) +PDF_MAKE_NAME("ImageMask", ImageMask) +PDF_MAKE_NAME("Include", Include) +PDF_MAKE_NAME("Index", Index) +PDF_MAKE_NAME("Indexed", Indexed) +PDF_MAKE_NAME("Info", Info) +PDF_MAKE_NAME("Ink", Ink) +PDF_MAKE_NAME("InkList", InkList) +PDF_MAKE_NAME("Intent", Intent) +PDF_MAKE_NAME("Interpolate", Interpolate) +PDF_MAKE_NAME("IsMap", IsMap) +PDF_MAKE_NAME("ItalicAngle", ItalicAngle) 
+PDF_MAKE_NAME("JBIG2Decode", JBIG2Decode) +PDF_MAKE_NAME("JBIG2Globals", JBIG2Globals) +PDF_MAKE_NAME("JPXDecode", JPXDecode) +PDF_MAKE_NAME("JS", JS) +PDF_MAKE_NAME("JavaScript", JavaScript) +PDF_MAKE_NAME("K", K) +PDF_MAKE_NAME("Keywords", Keywords) +PDF_MAKE_NAME("Kids", Kids) +PDF_MAKE_NAME("L", L) +PDF_MAKE_NAME("LBody", LBody) +PDF_MAKE_NAME("LC", LC) +PDF_MAKE_NAME("LE", LE) +PDF_MAKE_NAME("LI", LI) +PDF_MAKE_NAME("LJ", LJ) +PDF_MAKE_NAME("LW", LW) +PDF_MAKE_NAME("LZ", LZ) +PDF_MAKE_NAME("LZW", LZW) +PDF_MAKE_NAME("LZWDecode", LZWDecode) +PDF_MAKE_NAME("Lab", Lab) +PDF_MAKE_NAME("Label", Label) +PDF_MAKE_NAME("Lang", Lang) +PDF_MAKE_NAME("Last", Last) +PDF_MAKE_NAME("LastChar", LastChar) +PDF_MAKE_NAME("LastPage", LastPage) +PDF_MAKE_NAME("Launch", Launch) +PDF_MAKE_NAME("Layer", Layer) +PDF_MAKE_NAME("Lbl", Lbl) +PDF_MAKE_NAME("Length", Length) +PDF_MAKE_NAME("Length1", Length1) +PDF_MAKE_NAME("Length2", Length2) +PDF_MAKE_NAME("Length3", Length3) +PDF_MAKE_NAME("Limits", Limits) +PDF_MAKE_NAME("Line", Line) +PDF_MAKE_NAME("Linearized", Linearized) +PDF_MAKE_NAME("Link", Link) +PDF_MAKE_NAME("List", List) +PDF_MAKE_NAME("Location", Location) +PDF_MAKE_NAME("Lock", Lock) +PDF_MAKE_NAME("Locked", Locked) +PDF_MAKE_NAME("Luminosity", Luminosity) +PDF_MAKE_NAME("M", M) +PDF_MAKE_NAME("MCID", MCID) +PDF_MAKE_NAME("MK", MK) +PDF_MAKE_NAME("ML", ML) +PDF_MAKE_NAME("MMType1", MMType1) +PDF_MAKE_NAME("Mac", Mac) +PDF_MAKE_NAME("Mask", Mask) +PDF_MAKE_NAME("Matrix", Matrix) +PDF_MAKE_NAME("Matte", Matte) +PDF_MAKE_NAME("MaxLen", MaxLen) +PDF_MAKE_NAME("MediaBox", MediaBox) +PDF_MAKE_NAME("Metadata", Metadata) +PDF_MAKE_NAME("MissingWidth", MissingWidth) +PDF_MAKE_NAME("ModDate", ModDate) +PDF_MAKE_NAME("Movie", Movie) +PDF_MAKE_NAME("Msg", Msg) +PDF_MAKE_NAME("Multiply", Multiply) +PDF_MAKE_NAME("N", N) +PDF_MAKE_NAME("Name", Name) +PDF_MAKE_NAME("Named", Named) +PDF_MAKE_NAME("Names", Names) +PDF_MAKE_NAME("NewWindow", NewWindow) +PDF_MAKE_NAME("Next", Next) 
+PDF_MAKE_NAME("NextPage", NextPage) +PDF_MAKE_NAME("NonEFontNoWarn", NonEFontNoWarn) +PDF_MAKE_NAME("NonStruct", NonStruct) +PDF_MAKE_NAME("None", None) +PDF_MAKE_NAME("Normal", Normal) +PDF_MAKE_NAME("NotApproved", NotApproved) +PDF_MAKE_NAME("NotForPublicRelease", NotForPublicRelease) +PDF_MAKE_NAME("Note", Note) +PDF_MAKE_NAME("NumSections", NumSections) +PDF_MAKE_NAME("Nums", Nums) +PDF_MAKE_NAME("O", O) +PDF_MAKE_NAME("OC", OC) +PDF_MAKE_NAME("OCG", OCG) +PDF_MAKE_NAME("OCGs", OCGs) +PDF_MAKE_NAME("OCMD", OCMD) +PDF_MAKE_NAME("OCProperties", OCProperties) +PDF_MAKE_NAME("OE", OE) +PDF_MAKE_NAME("OFF", OFF) +PDF_MAKE_NAME("ON", ON) +PDF_MAKE_NAME("OP", OP) +PDF_MAKE_NAME("OPM", OPM) +PDF_MAKE_NAME("OS", OS) +PDF_MAKE_NAME("ObjStm", ObjStm) +PDF_MAKE_NAME("Of", Of) +PDF_MAKE_NAME("Off", Off) +PDF_MAKE_NAME("Open", Open) +PDF_MAKE_NAME("OpenArrow", OpenArrow) +PDF_MAKE_NAME("OpenType", OpenType) +PDF_MAKE_NAME("Opt", Opt) +PDF_MAKE_NAME("Order", Order) +PDF_MAKE_NAME("Ordering", Ordering) +PDF_MAKE_NAME("Outlines", Outlines) +PDF_MAKE_NAME("OutputCondition", OutputCondition) +PDF_MAKE_NAME("OutputConditionIdentifier", OutputConditionIdentifier) +PDF_MAKE_NAME("OutputIntent", OutputIntent) +PDF_MAKE_NAME("OutputIntents", OutputIntents) +PDF_MAKE_NAME("P", P) +PDF_MAKE_NAME("PDF", PDF) +PDF_MAKE_NAME("PS", PS) +PDF_MAKE_NAME("Page", Page) +PDF_MAKE_NAME("PageLabels", PageLabels) +PDF_MAKE_NAME("PageMode", PageMode) +PDF_MAKE_NAME("Pages", Pages) +PDF_MAKE_NAME("PaintType", PaintType) +PDF_MAKE_NAME("Params", Params) +PDF_MAKE_NAME("Parent", Parent) +PDF_MAKE_NAME("ParentTree", ParentTree) +PDF_MAKE_NAME("Part", Part) +PDF_MAKE_NAME("Pattern", Pattern) +PDF_MAKE_NAME("PatternType", PatternType) +PDF_MAKE_NAME("Perms", Perms) +PDF_MAKE_NAME("PolyLine", PolyLine) +PDF_MAKE_NAME("Polygon", Polygon) +PDF_MAKE_NAME("Popup", Popup) +PDF_MAKE_NAME("PreRelease", PreRelease) +PDF_MAKE_NAME("Predictor", Predictor) +PDF_MAKE_NAME("Prev", Prev) +PDF_MAKE_NAME("PrevPage", 
PrevPage) +PDF_MAKE_NAME("Preview", Preview) +PDF_MAKE_NAME("Print", Print) +PDF_MAKE_NAME("PrinterMark", PrinterMark) +PDF_MAKE_NAME("Private", Private) +PDF_MAKE_NAME("ProcSet", ProcSet) +PDF_MAKE_NAME("Producer", Producer) +PDF_MAKE_NAME("Prop_AuthTime", Prop_AuthTime) +PDF_MAKE_NAME("Prop_AuthType", Prop_AuthType) +PDF_MAKE_NAME("Prop_Build", Prop_Build) +PDF_MAKE_NAME("Properties", Properties) +PDF_MAKE_NAME("PubSec", PubSec) +PDF_MAKE_NAME("Push", Push) +PDF_MAKE_NAME("Q", Q) +PDF_MAKE_NAME("QuadPoints", QuadPoints) +PDF_MAKE_NAME("Quote", Quote) +PDF_MAKE_NAME("R", R) +PDF_MAKE_NAME("RB", RB) +PDF_MAKE_NAME("RBGroups", RBGroups) +PDF_MAKE_NAME("RC", RC) +PDF_MAKE_NAME("RClosedArrow", RClosedArrow) +PDF_MAKE_NAME("RD", RD) +PDF_MAKE_NAME("REx", REx) +PDF_MAKE_NAME("RGB", RGB) +PDF_MAKE_NAME("RI", RI) +PDF_MAKE_NAME("RL", RL) +PDF_MAKE_NAME("ROpenArrow", ROpenArrow) +PDF_MAKE_NAME("RP", RP) +PDF_MAKE_NAME("RT", RT) +PDF_MAKE_NAME("Range", Range) +PDF_MAKE_NAME("Reason", Reason) +PDF_MAKE_NAME("Rect", Rect) +PDF_MAKE_NAME("Redact", Redact) +PDF_MAKE_NAME("Ref", Ref) +PDF_MAKE_NAME("Reference", Reference) +PDF_MAKE_NAME("Registry", Registry) +PDF_MAKE_NAME("ResetForm", ResetForm) +PDF_MAKE_NAME("Resources", Resources) +PDF_MAKE_NAME("RoleMap", RoleMap) +PDF_MAKE_NAME("Root", Root) +PDF_MAKE_NAME("Rotate", Rotate) +PDF_MAKE_NAME("Rows", Rows) +PDF_MAKE_NAME("Ruby", Ruby) +PDF_MAKE_NAME("RunLengthDecode", RunLengthDecode) +PDF_MAKE_NAME("S", S) +PDF_MAKE_NAME("SMask", SMask) +PDF_MAKE_NAME("SMaskInData", SMaskInData) +PDF_MAKE_NAME("Schema", Schema) +PDF_MAKE_NAME("Screen", Screen) +PDF_MAKE_NAME("Sect", Sect) +PDF_MAKE_NAME("Separation", Separation) +PDF_MAKE_NAME("Shading", Shading) +PDF_MAKE_NAME("ShadingType", ShadingType) +PDF_MAKE_NAME("Si", Si) +PDF_MAKE_NAME("Sig", Sig) +PDF_MAKE_NAME("SigFlags", SigFlags) +PDF_MAKE_NAME("SigQ", SigQ) +PDF_MAKE_NAME("SigRef", SigRef) +PDF_MAKE_NAME("Size", Size) +PDF_MAKE_NAME("Slash", Slash) +PDF_MAKE_NAME("Sold", Sold) 
+PDF_MAKE_NAME("Sound", Sound) +PDF_MAKE_NAME("Span", Span) +PDF_MAKE_NAME("Split", Split) +PDF_MAKE_NAME("Square", Square) +PDF_MAKE_NAME("Squiggly", Squiggly) +PDF_MAKE_NAME("St", St) +PDF_MAKE_NAME("Stamp", Stamp) +PDF_MAKE_NAME("Standard", Standard) +PDF_MAKE_NAME("StdCF", StdCF) +PDF_MAKE_NAME("StemV", StemV) +PDF_MAKE_NAME("StmF", StmF) +PDF_MAKE_NAME("StrF", StrF) +PDF_MAKE_NAME("StrikeOut", StrikeOut) +PDF_MAKE_NAME("Strong", Strong) +PDF_MAKE_NAME("StructParent", StructParent) +PDF_MAKE_NAME("StructParents", StructParents) +PDF_MAKE_NAME("StructTreeRoot", StructTreeRoot) +PDF_MAKE_NAME("Sub", Sub) +PDF_MAKE_NAME("SubFilter", SubFilter) +PDF_MAKE_NAME("Subject", Subject) +PDF_MAKE_NAME("Subtype", Subtype) +PDF_MAKE_NAME("Subtype2", Subtype2) +PDF_MAKE_NAME("Supplement", Supplement) +PDF_MAKE_NAME("Symb", Symb) +PDF_MAKE_NAME("T", T) +PDF_MAKE_NAME("TBody", TBody) +PDF_MAKE_NAME("TD", TD) +PDF_MAKE_NAME("TFoot", TFoot) +PDF_MAKE_NAME("TH", TH) +PDF_MAKE_NAME("THead", THead) +PDF_MAKE_NAME("TI", TI) +PDF_MAKE_NAME("TOC", TOC) +PDF_MAKE_NAME("TOCI", TOCI) +PDF_MAKE_NAME("TR", TR) +PDF_MAKE_NAME("TR2", TR2) +PDF_MAKE_NAME("TU", TU) +PDF_MAKE_NAME("Table", Table) +PDF_MAKE_NAME("Text", Text) +PDF_MAKE_NAME("TilingType", TilingType) +PDF_MAKE_NAME("Times", Times) +PDF_MAKE_NAME("Title", Title) +PDF_MAKE_NAME("ToUnicode", ToUnicode) +PDF_MAKE_NAME("TopSecret", TopSecret) +PDF_MAKE_NAME("Trans", Trans) +PDF_MAKE_NAME("TransformMethod", TransformMethod) +PDF_MAKE_NAME("TransformParams", TransformParams) +PDF_MAKE_NAME("Transparency", Transparency) +PDF_MAKE_NAME("TrapNet", TrapNet) +PDF_MAKE_NAME("TrimBox", TrimBox) +PDF_MAKE_NAME("TrueType", TrueType) +PDF_MAKE_NAME("TrustedMode", TrustedMode) +PDF_MAKE_NAME("Tx", Tx) +PDF_MAKE_NAME("Type", Type) +PDF_MAKE_NAME("Type0", Type0) +PDF_MAKE_NAME("Type1", Type1) +PDF_MAKE_NAME("Type1C", Type1C) +PDF_MAKE_NAME("Type3", Type3) +PDF_MAKE_NAME("U", U) +PDF_MAKE_NAME("UE", UE) +PDF_MAKE_NAME("UF", UF) +PDF_MAKE_NAME("URI", 
URI) +PDF_MAKE_NAME("URL", URL) +PDF_MAKE_NAME("Unchanged", Unchanged) +PDF_MAKE_NAME("Uncover", Uncover) +PDF_MAKE_NAME("Underline", Underline) +PDF_MAKE_NAME("Unix", Unix) +PDF_MAKE_NAME("Usage", Usage) +PDF_MAKE_NAME("UseBlackPtComp", UseBlackPtComp) +PDF_MAKE_NAME("UseCMap", UseCMap) +PDF_MAKE_NAME("UseOutlines", UseOutlines) +PDF_MAKE_NAME("UserUnit", UserUnit) +PDF_MAKE_NAME("V", V) +PDF_MAKE_NAME("V2", V2) +PDF_MAKE_NAME("VE", VE) +PDF_MAKE_NAME("Version", Version) +PDF_MAKE_NAME("Vertices", Vertices) +PDF_MAKE_NAME("VerticesPerRow", VerticesPerRow) +PDF_MAKE_NAME("View", View) +PDF_MAKE_NAME("W", W) +PDF_MAKE_NAME("W2", W2) +PDF_MAKE_NAME("WMode", WMode) +PDF_MAKE_NAME("WP", WP) +PDF_MAKE_NAME("WT", WT) +PDF_MAKE_NAME("Warichu", Warichu) +PDF_MAKE_NAME("Watermark", Watermark) +PDF_MAKE_NAME("WhitePoint", WhitePoint) +PDF_MAKE_NAME("Widget", Widget) +PDF_MAKE_NAME("Width", Width) +PDF_MAKE_NAME("Widths", Widths) +PDF_MAKE_NAME("WinAnsiEncoding", WinAnsiEncoding) +PDF_MAKE_NAME("Wipe", Wipe) +PDF_MAKE_NAME("XFA", XFA) +PDF_MAKE_NAME("XHeight", XHeight) +PDF_MAKE_NAME("XML", XML) +PDF_MAKE_NAME("XObject", XObject) +PDF_MAKE_NAME("XRef", XRef) +PDF_MAKE_NAME("XRefStm", XRefStm) +PDF_MAKE_NAME("XStep", XStep) +PDF_MAKE_NAME("XYZ", XYZ) +PDF_MAKE_NAME("YStep", YStep) +PDF_MAKE_NAME("Yes", Yes) +PDF_MAKE_NAME("ZaDb", ZaDb) +PDF_MAKE_NAME("a", a) +PDF_MAKE_NAME("adbe.pkcs7.detached", adbe_pkcs7_detached) +PDF_MAKE_NAME("ca", ca) +PDF_MAKE_NAME("n0", n0) +PDF_MAKE_NAME("n1", n1) +PDF_MAKE_NAME("n2", n2) +PDF_MAKE_NAME("op", op) +PDF_MAKE_NAME("r", r) diff --git a/include/mupdf/pdf/object.h b/include/mupdf/pdf/object.h new file mode 100644 index 0000000..e60dab3 --- /dev/null +++ b/include/mupdf/pdf/object.h @@ -0,0 +1,406 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. 
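The name-table.h list above is an instance of the "X-macro" pattern: the same entries are expanded several times under different definitions of PDF_MAKE_NAME. The sketch below illustrates the idea with a shortened list. All DEMO_* identifiers are hypothetical, and the list is wrapped in a macro only to keep the example self-contained (the real header is a bare list). The binary-search lookup is an assumption about why the list is kept sorted, not a copy of MuPDF's code.

```c
#include <string.h>

/* Hypothetical, shortened stand-in for the name list. */
#define DEMO_NAME_LIST \
	DEMO_MAKE_NAME("AcroForm", AcroForm) \
	DEMO_MAKE_NAME("Identity-H", Identity_H) \
	DEMO_MAKE_NAME("Type", Type)

/* First expansion: one enum constant per entry. */
#define DEMO_MAKE_NAME(STRING, NAME) DEMO_ENUM_NAME_##NAME,
enum { DEMO_NAME_LIST DEMO_ENUM_LIMIT };
#undef DEMO_MAKE_NAME

/* Second expansion: the matching string table, in the same order. */
#define DEMO_MAKE_NAME(STRING, NAME) STRING,
static const char *demo_name_strings[] = { DEMO_NAME_LIST };
#undef DEMO_MAKE_NAME

/* Binary search over the table; valid only while the list stays
 * sorted in strcmp order. Returns the enum value, or -1 if absent. */
static int demo_lookup(const char *s)
{
	int lo = 0, hi = DEMO_ENUM_LIMIT - 1;
	while (lo <= hi)
	{
		int mid = (lo + hi) / 2;
		int c = strcmp(s, demo_name_strings[mid]);
		if (c == 0)
			return mid;
		if (c < 0)
			hi = mid - 1;
		else
			lo = mid + 1;
	}
	return -1;
}
```

Because the enum and the string table come from one list, they cannot drift out of step. The second macro argument exists because strings such as "Identity-H" are not valid C identifiers.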
+// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_PDF_OBJECT_H +#define MUPDF_PDF_OBJECT_H + +#include "mupdf/fitz/stream.h" + +typedef struct pdf_document pdf_document; +typedef struct pdf_crypt pdf_crypt; +typedef struct pdf_journal pdf_journal; + +/* Defined in PDF 1.7 according to Acrobat limit. */ +#define PDF_MAX_OBJECT_NUMBER 8388607 +#define PDF_MAX_GEN_NUMBER 65535 + +/* + * Dynamic objects. + * The same type of objects as found in PDF and PostScript. + * Used by the filters and the mupdf parser. + */ + +typedef struct pdf_obj pdf_obj; + +pdf_obj *pdf_new_int(fz_context *ctx, int64_t i); +pdf_obj *pdf_new_real(fz_context *ctx, float f); +pdf_obj *pdf_new_name(fz_context *ctx, const char *str); +pdf_obj *pdf_new_string(fz_context *ctx, const char *str, size_t len); + +/* + Create a PDF 'text string' by encoding input string as either ASCII or UTF-16BE. + In theory, we could also use PDFDocEncoding. 
+*/ +pdf_obj *pdf_new_text_string(fz_context *ctx, const char *s); +pdf_obj *pdf_new_indirect(fz_context *ctx, pdf_document *doc, int num, int gen); +pdf_obj *pdf_new_array(fz_context *ctx, pdf_document *doc, int initialcap); +pdf_obj *pdf_new_dict(fz_context *ctx, pdf_document *doc, int initialcap); +pdf_obj *pdf_new_rect(fz_context *ctx, pdf_document *doc, fz_rect rect); +pdf_obj *pdf_new_matrix(fz_context *ctx, pdf_document *doc, fz_matrix mtx); +pdf_obj *pdf_new_date(fz_context *ctx, pdf_document *doc, int64_t time); +pdf_obj *pdf_copy_array(fz_context *ctx, pdf_obj *array); +pdf_obj *pdf_copy_dict(fz_context *ctx, pdf_obj *dict); +pdf_obj *pdf_deep_copy_obj(fz_context *ctx, pdf_obj *obj); + +pdf_obj *pdf_keep_obj(fz_context *ctx, pdf_obj *obj); +void pdf_drop_obj(fz_context *ctx, pdf_obj *obj); +pdf_obj *pdf_drop_singleton_obj(fz_context *ctx, pdf_obj *obj); + +int pdf_is_null(fz_context *ctx, pdf_obj *obj); +int pdf_is_bool(fz_context *ctx, pdf_obj *obj); +int pdf_is_int(fz_context *ctx, pdf_obj *obj); +int pdf_is_real(fz_context *ctx, pdf_obj *obj); +int pdf_is_number(fz_context *ctx, pdf_obj *obj); +int pdf_is_name(fz_context *ctx, pdf_obj *obj); +int pdf_is_string(fz_context *ctx, pdf_obj *obj); +int pdf_is_array(fz_context *ctx, pdf_obj *obj); +int pdf_is_dict(fz_context *ctx, pdf_obj *obj); +int pdf_is_indirect(fz_context *ctx, pdf_obj *obj); + +/* + Check if an object is a stream or not. +*/ +int pdf_obj_num_is_stream(fz_context *ctx, pdf_document *doc, int num); +int pdf_is_stream(fz_context *ctx, pdf_obj *obj); + +/* Compare 2 objects. Returns 0 on match, non-zero on mismatch. + * Streams always mismatch. + */ +int pdf_objcmp(fz_context *ctx, pdf_obj *a, pdf_obj *b); +int pdf_objcmp_resolve(fz_context *ctx, pdf_obj *a, pdf_obj *b); + +/* Compare 2 objects. Returns 0 on match, non-zero on mismatch. + * Stream contents are explicitly checked. 
+ */ +int pdf_objcmp_deep(fz_context *ctx, pdf_obj *a, pdf_obj *b); + +int pdf_name_eq(fz_context *ctx, pdf_obj *a, pdf_obj *b); + +int pdf_obj_marked(fz_context *ctx, pdf_obj *obj); +int pdf_mark_obj(fz_context *ctx, pdf_obj *obj); +void pdf_unmark_obj(fz_context *ctx, pdf_obj *obj); + +typedef struct pdf_cycle_list pdf_cycle_list; +struct pdf_cycle_list { + pdf_cycle_list *up; + int num; +}; +int pdf_cycle(fz_context *ctx, pdf_cycle_list *here, pdf_cycle_list *prev, pdf_obj *obj); + +typedef struct +{ + int len; + unsigned char bits[1]; +} pdf_mark_bits; + +pdf_mark_bits *pdf_new_mark_bits(fz_context *ctx, pdf_document *doc); +void pdf_drop_mark_bits(fz_context *ctx, pdf_mark_bits *marks); +void pdf_mark_bits_reset(fz_context *ctx, pdf_mark_bits *marks); +int pdf_mark_bits_set(fz_context *ctx, pdf_mark_bits *marks, pdf_obj *obj); + +typedef struct +{ + int len; + int max; + int *list; + int local_list[8]; +} pdf_mark_list; + +int pdf_mark_list_push(fz_context *ctx, pdf_mark_list *list, pdf_obj *obj); +void pdf_mark_list_pop(fz_context *ctx, pdf_mark_list *list); +void pdf_mark_list_init(fz_context *ctx, pdf_mark_list *list); +void pdf_mark_list_free(fz_context *ctx, pdf_mark_list *list); + +void pdf_set_obj_memo(fz_context *ctx, pdf_obj *obj, int bit, int memo); +int pdf_obj_memo(fz_context *ctx, pdf_obj *obj, int bit, int *memo); + +int pdf_obj_is_dirty(fz_context *ctx, pdf_obj *obj); +void pdf_dirty_obj(fz_context *ctx, pdf_obj *obj); +void pdf_clean_obj(fz_context *ctx, pdf_obj *obj); + +int pdf_to_bool(fz_context *ctx, pdf_obj *obj); +int pdf_to_int(fz_context *ctx, pdf_obj *obj); +int64_t pdf_to_int64(fz_context *ctx, pdf_obj *obj); +float pdf_to_real(fz_context *ctx, pdf_obj *obj); +const char *pdf_to_name(fz_context *ctx, pdf_obj *obj); +const char *pdf_to_text_string(fz_context *ctx, pdf_obj *obj); +const char *pdf_to_string(fz_context *ctx, pdf_obj *obj, size_t *sizep); +char *pdf_to_str_buf(fz_context *ctx, pdf_obj *obj); +size_t 
pdf_to_str_len(fz_context *ctx, pdf_obj *obj); +int pdf_to_num(fz_context *ctx, pdf_obj *obj); +int pdf_to_gen(fz_context *ctx, pdf_obj *obj); + +int pdf_array_len(fz_context *ctx, pdf_obj *array); +pdf_obj *pdf_array_get(fz_context *ctx, pdf_obj *array, int i); +void pdf_array_put(fz_context *ctx, pdf_obj *array, int i, pdf_obj *obj); +void pdf_array_put_drop(fz_context *ctx, pdf_obj *array, int i, pdf_obj *obj); +void pdf_array_push(fz_context *ctx, pdf_obj *array, pdf_obj *obj); +void pdf_array_push_drop(fz_context *ctx, pdf_obj *array, pdf_obj *obj); +void pdf_array_insert(fz_context *ctx, pdf_obj *array, pdf_obj *obj, int index); +void pdf_array_insert_drop(fz_context *ctx, pdf_obj *array, pdf_obj *obj, int index); +void pdf_array_delete(fz_context *ctx, pdf_obj *array, int index); +int pdf_array_find(fz_context *ctx, pdf_obj *array, pdf_obj *obj); +int pdf_array_contains(fz_context *ctx, pdf_obj *array, pdf_obj *obj); + +int pdf_dict_len(fz_context *ctx, pdf_obj *dict); +pdf_obj *pdf_dict_get_key(fz_context *ctx, pdf_obj *dict, int idx); +pdf_obj *pdf_dict_get_val(fz_context *ctx, pdf_obj *dict, int idx); +void pdf_dict_put_val_null(fz_context *ctx, pdf_obj *obj, int idx); +pdf_obj *pdf_dict_get(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +pdf_obj *pdf_dict_getp(fz_context *ctx, pdf_obj *dict, const char *path); +pdf_obj *pdf_dict_getl(fz_context *ctx, pdf_obj *dict, ...); +pdf_obj *pdf_dict_geta(fz_context *ctx, pdf_obj *dict, pdf_obj *key, pdf_obj *abbrev); +pdf_obj *pdf_dict_gets(fz_context *ctx, pdf_obj *dict, const char *key); +pdf_obj *pdf_dict_getsa(fz_context *ctx, pdf_obj *dict, const char *key, const char *abbrev); +pdf_obj *pdf_dict_get_inheritable(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +pdf_obj *pdf_dict_getp_inheritable(fz_context *ctx, pdf_obj *dict, const char *path); +pdf_obj *pdf_dict_gets_inheritable(fz_context *ctx, pdf_obj *dict, const char *key); +void pdf_dict_put(fz_context *ctx, pdf_obj *dict, pdf_obj *key, pdf_obj *val); 
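The header does not document the difference between the plain and `_drop` variants (pdf_array_put vs pdf_array_put_drop, pdf_dict_put vs pdf_dict_put_drop). It follows the keep/drop reference-counting convention of pdf_keep_obj and pdf_drop_obj: the container takes its own reference, and the `_drop` form additionally releases the caller's. Below is a minimal sketch of that convention with a toy object type. Every demo_* name is hypothetical and this is not MuPDF's implementation.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object; not MuPDF's pdf_obj. */
typedef struct { int refs; int value; } demo_obj;

static demo_obj *demo_new(int value)
{
	demo_obj *o = malloc(sizeof *o);
	if (!o)
		abort();
	o->refs = 1; /* the creator owns the first reference */
	o->value = value;
	return o;
}

static demo_obj *demo_keep(demo_obj *o) { if (o) o->refs++; return o; }
static void demo_drop(demo_obj *o) { if (o && --o->refs == 0) free(o); }

/* Toy container with a single slot. */
typedef struct { demo_obj *slot; } demo_holder;

/* "put": the holder takes its own reference; the caller keeps theirs. */
static void demo_put(demo_holder *h, demo_obj *val)
{
	demo_obj *old = h->slot;
	h->slot = demo_keep(val); /* keep first, in case val == old */
	demo_drop(old);
}

/* "put_drop": as above, then the caller's reference is released,
 * so a freshly created object can be handed over in one call. */
static void demo_put_drop(demo_holder *h, demo_obj *val)
{
	demo_put(h, val);
	demo_drop(val);
}
```

Under this convention, `demo_put_drop(&h, demo_new(2))` leaves exactly one reference, owned by the holder. With `demo_put` the caller must still call `demo_drop` on its own reference.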
+void pdf_dict_put_drop(fz_context *ctx, pdf_obj *dict, pdf_obj *key, pdf_obj *val); +void pdf_dict_get_put_drop(fz_context *ctx, pdf_obj *dict, pdf_obj *key, pdf_obj *val, pdf_obj **old_val); +void pdf_dict_puts(fz_context *ctx, pdf_obj *dict, const char *key, pdf_obj *val); +void pdf_dict_puts_drop(fz_context *ctx, pdf_obj *dict, const char *key, pdf_obj *val); +void pdf_dict_putp(fz_context *ctx, pdf_obj *dict, const char *path, pdf_obj *val); +void pdf_dict_putp_drop(fz_context *ctx, pdf_obj *dict, const char *path, pdf_obj *val); +void pdf_dict_putl(fz_context *ctx, pdf_obj *dict, pdf_obj *val, ...); +void pdf_dict_putl_drop(fz_context *ctx, pdf_obj *dict, pdf_obj *val, ...); +void pdf_dict_del(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +void pdf_dict_dels(fz_context *ctx, pdf_obj *dict, const char *key); +void pdf_sort_dict(fz_context *ctx, pdf_obj *dict); + +void pdf_dict_put_bool(fz_context *ctx, pdf_obj *dict, pdf_obj *key, int x); +void pdf_dict_put_int(fz_context *ctx, pdf_obj *dict, pdf_obj *key, int64_t x); +void pdf_dict_put_real(fz_context *ctx, pdf_obj *dict, pdf_obj *key, double x); +void pdf_dict_put_name(fz_context *ctx, pdf_obj *dict, pdf_obj *key, const char *x); +void pdf_dict_put_string(fz_context *ctx, pdf_obj *dict, pdf_obj *key, const char *x, size_t n); +void pdf_dict_put_text_string(fz_context *ctx, pdf_obj *dict, pdf_obj *key, const char *x); +void pdf_dict_put_rect(fz_context *ctx, pdf_obj *dict, pdf_obj *key, fz_rect x); +void pdf_dict_put_matrix(fz_context *ctx, pdf_obj *dict, pdf_obj *key, fz_matrix x); +void pdf_dict_put_date(fz_context *ctx, pdf_obj *dict, pdf_obj *key, int64_t time); +pdf_obj *pdf_dict_put_array(fz_context *ctx, pdf_obj *dict, pdf_obj *key, int initial); +pdf_obj *pdf_dict_put_dict(fz_context *ctx, pdf_obj *dict, pdf_obj *key, int initial); +pdf_obj *pdf_dict_puts_dict(fz_context *ctx, pdf_obj *dict, const char *key, int initial); + +int pdf_dict_get_bool(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +int 
pdf_dict_get_int(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +int64_t pdf_dict_get_int64(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +float pdf_dict_get_real(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +const char *pdf_dict_get_name(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +const char *pdf_dict_get_string(fz_context *ctx, pdf_obj *dict, pdf_obj *key, size_t *sizep); +const char *pdf_dict_get_text_string(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +fz_rect pdf_dict_get_rect(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +fz_matrix pdf_dict_get_matrix(fz_context *ctx, pdf_obj *dict, pdf_obj *key); +int64_t pdf_dict_get_date(fz_context *ctx, pdf_obj *dict, pdf_obj *key); + +void pdf_array_push_bool(fz_context *ctx, pdf_obj *array, int x); +void pdf_array_push_int(fz_context *ctx, pdf_obj *array, int64_t x); +void pdf_array_push_real(fz_context *ctx, pdf_obj *array, double x); +void pdf_array_push_name(fz_context *ctx, pdf_obj *array, const char *x); +void pdf_array_push_string(fz_context *ctx, pdf_obj *array, const char *x, size_t n); +void pdf_array_push_text_string(fz_context *ctx, pdf_obj *array, const char *x); +pdf_obj *pdf_array_push_array(fz_context *ctx, pdf_obj *array, int initial); +pdf_obj *pdf_array_push_dict(fz_context *ctx, pdf_obj *array, int initial); + +int pdf_array_get_bool(fz_context *ctx, pdf_obj *array, int index); +int pdf_array_get_int(fz_context *ctx, pdf_obj *array, int index); +float pdf_array_get_real(fz_context *ctx, pdf_obj *array, int index); +const char *pdf_array_get_name(fz_context *ctx, pdf_obj *array, int index); +const char *pdf_array_get_string(fz_context *ctx, pdf_obj *array, int index, size_t *sizep); +const char *pdf_array_get_text_string(fz_context *ctx, pdf_obj *array, int index); +fz_rect pdf_array_get_rect(fz_context *ctx, pdf_obj *array, int index); +fz_matrix pdf_array_get_matrix(fz_context *ctx, pdf_obj *array, int index); + +void pdf_set_obj_parent(fz_context *ctx, pdf_obj *obj, int num); + +int 
pdf_obj_refs(fz_context *ctx, pdf_obj *ref); + +int pdf_obj_parent_num(fz_context *ctx, pdf_obj *obj); + +char *pdf_sprint_obj(fz_context *ctx, char *buf, size_t cap, size_t *len, pdf_obj *obj, int tight, int ascii); +void pdf_print_obj(fz_context *ctx, fz_output *out, pdf_obj *obj, int tight, int ascii); +void pdf_print_encrypted_obj(fz_context *ctx, fz_output *out, pdf_obj *obj, int tight, int ascii, pdf_crypt *crypt, int num, int gen); + +void pdf_debug_obj(fz_context *ctx, pdf_obj *obj); +void pdf_debug_ref(fz_context *ctx, pdf_obj *obj); + +/* + Convert Unicode/PdfDocEncoding string into utf-8. + + The returned string must be freed by the caller. +*/ +char *pdf_new_utf8_from_pdf_string(fz_context *ctx, const char *srcptr, size_t srclen); + +/* + Convert text string object to UTF-8. + + The returned string must be freed by the caller. +*/ +char *pdf_new_utf8_from_pdf_string_obj(fz_context *ctx, pdf_obj *src); + +/* + Load text stream and convert to UTF-8. + + The returned string must be freed by the caller. +*/ +char *pdf_new_utf8_from_pdf_stream_obj(fz_context *ctx, pdf_obj *src); + +/* + Load text stream or text string and convert to UTF-8. + + The returned string must be freed by the caller. +*/ +char *pdf_load_stream_or_string_as_utf8(fz_context *ctx, pdf_obj *src); + +fz_quad pdf_to_quad(fz_context *ctx, pdf_obj *array, int offset); +fz_rect pdf_to_rect(fz_context *ctx, pdf_obj *array); +fz_matrix pdf_to_matrix(fz_context *ctx, pdf_obj *array); +int64_t pdf_to_date(fz_context *ctx, pdf_obj *time); + +/* + pdf_get_indirect_document and pdf_get_bound_document are + now deprecated. Please do not use them in future. They will + be removed. + + Please use pdf_pin_document instead. +*/ +pdf_document *pdf_get_indirect_document(fz_context *ctx, pdf_obj *obj); +pdf_document *pdf_get_bound_document(fz_context *ctx, pdf_obj *obj); + +/* + pdf_pin_document returns a new reference to the document + to which obj is bound. 
The caller is responsible for + dropping this reference once they have finished with it. + + This is a replacement for pdf_get_indirect_document + and pdf_get_bound_document that are now deprecated. Those + returned a borrowed reference that did not need to be + dropped. + + Note that this can validly return NULL in various cases: + 1) When the object is of a simple type (such as a number + or a string), it contains no reference to the enclosing + document. 2) When the object has yet to be inserted into + a PDF document (such as during parsing). 3) And (in + future versions) when the document has been destroyed + but the object reference remains. + + It is the caller's responsibility to deal with a NULL + return here. +*/ +pdf_document *pdf_pin_document(fz_context *ctx, pdf_obj *obj); + +void pdf_set_int(fz_context *ctx, pdf_obj *obj, int64_t i); + +/* Voodoo to create PDF_NAME(Foo) macros from name-table.h */ + +#define PDF_NAME(X) ((pdf_obj*)(intptr_t)PDF_ENUM_NAME_##X) + +#define PDF_MAKE_NAME(STRING,NAME) PDF_ENUM_NAME_##NAME, +enum { + PDF_ENUM_NULL, + PDF_ENUM_TRUE, + PDF_ENUM_FALSE, +#include "mupdf/pdf/name-table.h" + PDF_ENUM_LIMIT, +}; +#undef PDF_MAKE_NAME + +#define PDF_NULL ((pdf_obj*)(intptr_t)PDF_ENUM_NULL) +#define PDF_TRUE ((pdf_obj*)(intptr_t)PDF_ENUM_TRUE) +#define PDF_FALSE ((pdf_obj*)(intptr_t)PDF_ENUM_FALSE) +#define PDF_LIMIT ((pdf_obj*)(intptr_t)PDF_ENUM_LIMIT) + + +/* Implementation details: subject to change. */ + +/* + for use by pdf_crypt_obj_imp to decrypt AES string in place +*/ +void pdf_set_str_len(fz_context *ctx, pdf_obj *obj, size_t newlen); + + +/* Journalling */ + +/* Call this to enable journalling on a given document. */ +void pdf_enable_journal(fz_context *ctx, pdf_document *doc); + +/* Call this to start an operation. Undo/redo works at 'operation' + * granularity. Nested operations are all counted within the outermost + * operation. 
Any modification performed on a journalled PDF without an + * operation having been started will throw an error. */ +void pdf_begin_operation(fz_context *ctx, pdf_document *doc, const char *operation); + +/* Call this to start an implicit operation. Implicit operations are + * operations that happen as a consequence of things like updating + * an annotation. They get rolled into the previous operation, because + * they generally happen as a result of them. */ +void pdf_begin_implicit_operation(fz_context *ctx, pdf_document *doc); + +/* Call this to end an operation. */ +void pdf_end_operation(fz_context *ctx, pdf_document *doc); + +/* Call this to abandon an operation. Revert to the state + * when you began. */ +void pdf_abandon_operation(fz_context *ctx, pdf_document *doc); + +/* Call this to find out how many undo/redo steps there are, and the + * current position we are within those. 0 = original document, + * *steps = final edited version. */ +int pdf_undoredo_state(fz_context *ctx, pdf_document *doc, int *steps); + +/* Call this to find the title of the operation within the undo state. */ +const char *pdf_undoredo_step(fz_context *ctx, pdf_document *doc, int step); + +/* Helper functions to identify if we are in a state to be able to undo + * or redo. */ +int pdf_can_undo(fz_context *ctx, pdf_document *doc); +int pdf_can_redo(fz_context *ctx, pdf_document *doc); + +/* Move backwards in the undo history. Throws an error if we are at the + * start. Any edits to the document at this point will discard all + * subsequent history. */ +void pdf_undo(fz_context *ctx, pdf_document *doc); + +/* Move forwards in the undo history. Throws an error if we are at the + * end. */ +void pdf_redo(fz_context *ctx, pdf_document *doc); + +/* Called to reset the entire history. This is called implicitly when + * a non-undoable change occurs (such as a pdf repair). */ +void pdf_discard_journal(fz_context *ctx, pdf_journal *journal); + +/* Internal destructor. 
*/ +void pdf_drop_journal(fz_context *ctx, pdf_journal *journal); + +/* Internal call as part of saving a snapshot of a PDF document. */ +void pdf_serialise_journal(fz_context *ctx, pdf_document *doc, fz_output *out); + +/* Internal call as part of loading a snapshot of a PDF document. */ +void pdf_deserialise_journal(fz_context *ctx, pdf_document *doc, fz_stream *stm); + +/* Internal call as part of creating objects. */ +void pdf_add_journal_fragment(fz_context *ctx, pdf_document *doc, int parent, pdf_obj *copy, fz_buffer *copy_stream, int newobj); + +char *pdf_format_date(fz_context *ctx, int64_t time, char *s, size_t n); +int64_t pdf_parse_date(fz_context *ctx, const char *s); + +#endif diff --git a/include/mupdf/pdf/page.h b/include/mupdf/pdf/page.h new file mode 100644 index 0000000..f20bf49 --- /dev/null +++ b/include/mupdf/pdf/page.h @@ -0,0 +1,200 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_PDF_PAGE_H +#define MUPDF_PDF_PAGE_H + +#include "mupdf/pdf/interpret.h" + +int pdf_lookup_page_number(fz_context *ctx, pdf_document *doc, pdf_obj *pageobj); +int pdf_count_pages(fz_context *ctx, pdf_document *doc); +int pdf_count_pages_imp(fz_context *ctx, fz_document *doc, int chapter); +pdf_obj *pdf_lookup_page_obj(fz_context *ctx, pdf_document *doc, int needle); + +/* + Cache the page tree for fast forward/reverse page lookups. + + No longer required. This is now a no-op, as page tree + maps are loaded automatically 'just in time'. +*/ +void pdf_load_page_tree(fz_context *ctx, pdf_document *doc); + +/* + Discard the page tree maps. + + No longer required. This is now a no-op, as page tree + maps are discarded automatically 'just in time'. +*/ +void pdf_drop_page_tree(fz_context *ctx, pdf_document *doc); + +void pdf_drop_page_tree_internal(fz_context *ctx, pdf_document *doc); + + +/* + Make page self-sufficient. + + Copy any inheritable page keys into the actual page object, removing + any dependencies on the page tree parents. +*/ +void pdf_flatten_inheritable_page_items(fz_context *ctx, pdf_obj *page); + +/* + Load a page and its resources. + + Locates the page in the PDF document and loads the page and its + resources. After pdf_load_page it is possible to retrieve the size + of the page using pdf_bound_page, or to render the page using + pdf_run_page_*. + + number: page number, where 0 is the first page of the document.
+*/ +pdf_page *pdf_load_page(fz_context *ctx, pdf_document *doc, int number); +fz_page *pdf_load_page_imp(fz_context *ctx, fz_document *doc, int chapter, int number); +int pdf_page_has_transparency(fz_context *ctx, pdf_page *page); + +void pdf_page_obj_transform(fz_context *ctx, pdf_obj *pageobj, fz_rect *page_mediabox, fz_matrix *page_ctm); +void pdf_page_transform(fz_context *ctx, pdf_page *page, fz_rect *mediabox, fz_matrix *ctm); +void pdf_page_obj_transform_box(fz_context *ctx, pdf_obj *pageobj, fz_rect *page_mediabox, fz_matrix *page_ctm, fz_box_type box); +void pdf_page_transform_box(fz_context *ctx, pdf_page *page, fz_rect *mediabox, fz_matrix *ctm, fz_box_type box); +pdf_obj *pdf_page_resources(fz_context *ctx, pdf_page *page); +pdf_obj *pdf_page_contents(fz_context *ctx, pdf_page *page); +pdf_obj *pdf_page_group(fz_context *ctx, pdf_page *page); + +/* + Get the separation details for a page. +*/ +fz_separations *pdf_page_separations(fz_context *ctx, pdf_page *page); + +pdf_ocg_descriptor *pdf_read_ocg(fz_context *ctx, pdf_document *doc); +void pdf_drop_ocg(fz_context *ctx, pdf_document *doc); +int pdf_is_ocg_hidden(fz_context *ctx, pdf_document *doc, pdf_obj *rdb, const char *usage, pdf_obj *ocg); + +fz_link *pdf_load_links(fz_context *ctx, pdf_page *page); + +/* + Determine the size of a page. + + Determine the page size in points, taking page rotation + into account. The page size is taken to be the crop box if it + exists (visible area after cropping), otherwise the media box will + be used (possibly including printing marks). +*/ +fz_rect pdf_bound_page(fz_context *ctx, pdf_page *page, fz_box_type box); + +/* + Interpret a loaded page and render it on a device. + + page: A page loaded by pdf_load_page. + + dev: Device used for rendering, obtained from fz_new_*_device. + + ctm: A transformation matrix applied to the objects on the page, + e.g. to scale or rotate the page contents as desired. 
+*/ +void pdf_run_page(fz_context *ctx, pdf_page *page, fz_device *dev, fz_matrix ctm, fz_cookie *cookie); + +/* + Interpret a loaded page and render it on a device. + + page: A page loaded by pdf_load_page. + + dev: Device used for rendering, obtained from fz_new_*_device. + + ctm: A transformation matrix applied to the objects on the page, + e.g. to scale or rotate the page contents as desired. + + usage: The 'usage' for displaying the file (typically + 'View', 'Print' or 'Export'). NULL means 'View'. + + cookie: A pointer to an optional fz_cookie structure that can be used + to track progress, collect errors etc. +*/ +void pdf_run_page_with_usage(fz_context *ctx, pdf_page *page, fz_device *dev, fz_matrix ctm, const char *usage, fz_cookie *cookie); + +/* + Interpret a loaded page and render it on a device. + Just the main page contents without the annotations + + page: A page loaded by pdf_load_page. + + dev: Device used for rendering, obtained from fz_new_*_device. + + ctm: A transformation matrix applied to the objects on the page, + e.g. to scale or rotate the page contents as desired. 
+*/ +void pdf_run_page_contents(fz_context *ctx, pdf_page *page, fz_device *dev, fz_matrix ctm, fz_cookie *cookie); +void pdf_run_page_annots(fz_context *ctx, pdf_page *page, fz_device *dev, fz_matrix ctm, fz_cookie *cookie); +void pdf_run_page_widgets(fz_context *ctx, pdf_page *page, fz_device *dev, fz_matrix ctm, fz_cookie *cookie); +void pdf_run_page_contents_with_usage(fz_context *ctx, pdf_page *page, fz_device *dev, fz_matrix ctm, const char *usage, fz_cookie *cookie); +void pdf_run_page_annots_with_usage(fz_context *ctx, pdf_page *page, fz_device *dev, fz_matrix ctm, const char *usage, fz_cookie *cookie); +void pdf_run_page_widgets_with_usage(fz_context *ctx, pdf_page *page, fz_device *dev, fz_matrix ctm, const char *usage, fz_cookie *cookie); + +void pdf_filter_page_contents(fz_context *ctx, pdf_document *doc, pdf_page *page, pdf_filter_options *options); +void pdf_filter_annot_contents(fz_context *ctx, pdf_document *doc, pdf_annot *annot, pdf_filter_options *options); + +fz_pixmap *pdf_new_pixmap_from_page_contents_with_usage(fz_context *ctx, pdf_page *page, fz_matrix ctm, fz_colorspace *cs, int alpha, const char *usage, fz_box_type box); +fz_pixmap *pdf_new_pixmap_from_page_with_usage(fz_context *ctx, pdf_page *page, fz_matrix ctm, fz_colorspace *cs, int alpha, const char *usage, fz_box_type box); +fz_pixmap *pdf_new_pixmap_from_page_contents_with_separations_and_usage(fz_context *ctx, pdf_page *page, fz_matrix ctm, fz_colorspace *cs, fz_separations *seps, int alpha, const char *usage, fz_box_type box); +fz_pixmap *pdf_new_pixmap_from_page_with_separations_and_usage(fz_context *ctx, pdf_page *page, fz_matrix ctm, fz_colorspace *cs, fz_separations *seps, int alpha, const char *usage, fz_box_type box); + +enum { + PDF_REDACT_IMAGE_NONE, + PDF_REDACT_IMAGE_REMOVE, + PDF_REDACT_IMAGE_PIXELS, +}; + +typedef struct +{ + int black_boxes; + int image_method; +} pdf_redact_options; + +int pdf_redact_page(fz_context *ctx, pdf_document *doc, pdf_page *page, 
pdf_redact_options *opts); + +fz_transition *pdf_page_presentation(fz_context *ctx, pdf_page *page, fz_transition *transition, float *duration); + +fz_default_colorspaces *pdf_load_default_colorspaces(fz_context *ctx, pdf_document *doc, pdf_page *page); + +/* + Update default colorspaces for an xobject. +*/ +fz_default_colorspaces *pdf_update_default_colorspaces(fz_context *ctx, fz_default_colorspaces *old_cs, pdf_obj *res); + +/* + * Page tree, pages and related objects + */ + +struct pdf_page +{ + fz_page super; + pdf_document *doc; /* type alias for super.doc */ + pdf_obj *obj; + + int transparency; + int overprint; + + fz_link *links; + pdf_annot *annots, **annot_tailp; + pdf_annot *widgets, **widget_tailp; +}; + +#endif diff --git a/include/mupdf/pdf/parse.h b/include/mupdf/pdf/parse.h new file mode 100644 index 0000000..3eb9205 --- /dev/null +++ b/include/mupdf/pdf/parse.h @@ -0,0 +1,61 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. 
+ +#ifndef MUPDF_PDF_PARSE_H +#define MUPDF_PDF_PARSE_H + +#include "mupdf/pdf/document.h" + +typedef enum +{ + PDF_TOK_ERROR, PDF_TOK_EOF, + PDF_TOK_OPEN_ARRAY, PDF_TOK_CLOSE_ARRAY, + PDF_TOK_OPEN_DICT, PDF_TOK_CLOSE_DICT, + PDF_TOK_OPEN_BRACE, PDF_TOK_CLOSE_BRACE, + PDF_TOK_NAME, PDF_TOK_INT, PDF_TOK_REAL, PDF_TOK_STRING, PDF_TOK_KEYWORD, + PDF_TOK_R, PDF_TOK_TRUE, PDF_TOK_FALSE, PDF_TOK_NULL, + PDF_TOK_OBJ, PDF_TOK_ENDOBJ, + PDF_TOK_STREAM, PDF_TOK_ENDSTREAM, + PDF_TOK_XREF, PDF_TOK_TRAILER, PDF_TOK_STARTXREF, + PDF_TOK_NEWOBJ, + PDF_NUM_TOKENS +} pdf_token; + +void pdf_lexbuf_init(fz_context *ctx, pdf_lexbuf *lexbuf, int size); +void pdf_lexbuf_fin(fz_context *ctx, pdf_lexbuf *lexbuf); +ptrdiff_t pdf_lexbuf_grow(fz_context *ctx, pdf_lexbuf *lexbuf); + +pdf_token pdf_lex(fz_context *ctx, fz_stream *f, pdf_lexbuf *lexbuf); +pdf_token pdf_lex_no_string(fz_context *ctx, fz_stream *f, pdf_lexbuf *lexbuf); + +pdf_obj *pdf_parse_array(fz_context *ctx, pdf_document *doc, fz_stream *f, pdf_lexbuf *buf); +pdf_obj *pdf_parse_dict(fz_context *ctx, pdf_document *doc, fz_stream *f, pdf_lexbuf *buf); +pdf_obj *pdf_parse_stm_obj(fz_context *ctx, pdf_document *doc, fz_stream *f, pdf_lexbuf *buf); +pdf_obj *pdf_parse_ind_obj(fz_context *ctx, pdf_document *doc, fz_stream *f, int *num, int *gen, int64_t *stm_ofs, int *try_repair); +pdf_obj *pdf_parse_journal_obj(fz_context *ctx, pdf_document *doc, fz_stream *stm, int *onum, fz_buffer **ostm, int *newobj); + +/* + print a lexed token to a buffer, growing if necessary +*/ +void pdf_append_token(fz_context *ctx, fz_buffer *buf, int tok, pdf_lexbuf *lex); + +#endif diff --git a/include/mupdf/pdf/resource.h b/include/mupdf/pdf/resource.h new file mode 100644 index 0000000..dee9748 --- /dev/null +++ b/include/mupdf/pdf/resource.h @@ -0,0 +1,133 @@ +// Copyright (C) 2004-2021 Artifex Software, Inc. +// +// This file is part of MuPDF. 
+// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_PDF_RESOURCE_H +#define MUPDF_PDF_RESOURCE_H + +#include "mupdf/fitz/font.h" +#include "mupdf/fitz/image.h" +#include "mupdf/fitz/shade.h" +#include "mupdf/fitz/store.h" +#include "mupdf/pdf/object.h" + +void pdf_store_item(fz_context *ctx, pdf_obj *key, void *val, size_t itemsize); +void *pdf_find_item(fz_context *ctx, fz_store_drop_fn *drop, pdf_obj *key); +void pdf_remove_item(fz_context *ctx, fz_store_drop_fn *drop, pdf_obj *key); +void pdf_empty_store(fz_context *ctx, pdf_document *doc); +void pdf_purge_locals_from_store(fz_context *ctx, pdf_document *doc); + +/* + * Structures used for managing resource locations and avoiding multiple + * occurrences when resources are added to the document. The search for existing + * resources will be performed when we are first trying to add an item. Object + * refs are stored in a fz_hash_table structure using a hash of the md5 sum of + * the data, enabling rapid lookup. 
+ */ + +enum { PDF_SIMPLE_FONT_RESOURCE=1, PDF_CID_FONT_RESOURCE=2, PDF_CJK_FONT_RESOURCE=3 }; +enum { PDF_SIMPLE_ENCODING_LATIN, PDF_SIMPLE_ENCODING_GREEK, PDF_SIMPLE_ENCODING_CYRILLIC }; + +/* The contents of this structure are defined publicly just so we can + * define this on the stack. */ +typedef struct +{ + unsigned char digest[16]; + int type; + int encoding; + int local_xref; +} pdf_font_resource_key; + +pdf_obj *pdf_find_font_resource(fz_context *ctx, pdf_document *doc, int type, int encoding, fz_font *item, pdf_font_resource_key *key); +pdf_obj *pdf_insert_font_resource(fz_context *ctx, pdf_document *doc, pdf_font_resource_key *key, pdf_obj *obj); +void pdf_drop_resource_tables(fz_context *ctx, pdf_document *doc); +void pdf_purge_local_font_resources(fz_context *ctx, pdf_document *doc); + +typedef struct pdf_function pdf_function; + +void pdf_eval_function(fz_context *ctx, pdf_function *func, const float *in, int inlen, float *out, int outlen); +pdf_function *pdf_keep_function(fz_context *ctx, pdf_function *func); +void pdf_drop_function(fz_context *ctx, pdf_function *func); +size_t pdf_function_size(fz_context *ctx, pdf_function *func); +pdf_function *pdf_load_function(fz_context *ctx, pdf_obj *ref, int in, int out); + +fz_colorspace *pdf_document_output_intent(fz_context *ctx, pdf_document *doc); +fz_colorspace *pdf_load_colorspace(fz_context *ctx, pdf_obj *obj); +int pdf_is_tint_colorspace(fz_context *ctx, fz_colorspace *cs); + +fz_shade *pdf_load_shading(fz_context *ctx, pdf_document *doc, pdf_obj *obj); +void pdf_sample_shade_function(fz_context *ctx, float shade[256][FZ_MAX_COLORS+1], int n, int funcs, pdf_function **func, float t0, float t1); + +/** + Function to recolor a single color from a shade. +*/ +typedef void (pdf_recolor_vertex)(fz_context *ctx, void *opaque, fz_colorspace *dst_cs, float *d, fz_colorspace *src_cs, const float *src); + +/** + Function to handle recoloring a shade. + + Called with src_cs from the shade.
If no recoloring is required, return NULL. Otherwise + fill in *dst_cs, and return a vertex recolorer. +*/ +typedef pdf_recolor_vertex *(pdf_shade_recolorer)(fz_context *ctx, void *opaque, fz_colorspace *src_cs, fz_colorspace **dst_cs); + +/** + Recolor a shade. +*/ +pdf_obj *pdf_recolor_shade(fz_context *ctx, pdf_obj *shade, pdf_shade_recolorer *reshade, void *opaque); + +fz_image *pdf_load_inline_image(fz_context *ctx, pdf_document *doc, pdf_obj *rdb, pdf_obj *dict, fz_stream *file); +int pdf_is_jpx_image(fz_context *ctx, pdf_obj *dict); + +fz_image *pdf_load_image(fz_context *ctx, pdf_document *doc, pdf_obj *obj); + +pdf_obj *pdf_add_image(fz_context *ctx, pdf_document *doc, fz_image *image); + +typedef struct +{ + fz_storable storable; + int ismask; + float xstep; + float ystep; + fz_matrix matrix; + fz_rect bbox; + pdf_document *document; + pdf_obj *resources; + pdf_obj *contents; + int id; /* unique ID for caching rendered tiles */ +} pdf_pattern; + +pdf_pattern *pdf_load_pattern(fz_context *ctx, pdf_document *doc, pdf_obj *obj); +pdf_pattern *pdf_keep_pattern(fz_context *ctx, pdf_pattern *pat); +void pdf_drop_pattern(fz_context *ctx, pdf_pattern *pat); + +pdf_obj *pdf_new_xobject(fz_context *ctx, pdf_document *doc, fz_rect bbox, fz_matrix matrix, pdf_obj *res, fz_buffer *buffer); +void pdf_update_xobject(fz_context *ctx, pdf_document *doc, pdf_obj *xobj, fz_rect bbox, fz_matrix mat, pdf_obj *res, fz_buffer *buffer); + +pdf_obj *pdf_xobject_resources(fz_context *ctx, pdf_obj *xobj); +fz_rect pdf_xobject_bbox(fz_context *ctx, pdf_obj *xobj); +fz_matrix pdf_xobject_matrix(fz_context *ctx, pdf_obj *xobj); +int pdf_xobject_isolated(fz_context *ctx, pdf_obj *xobj); +int pdf_xobject_knockout(fz_context *ctx, pdf_obj *xobj); +int pdf_xobject_transparency(fz_context *ctx, pdf_obj *xobj); +fz_colorspace *pdf_xobject_colorspace(fz_context *ctx, pdf_obj *xobj); + +#endif diff --git a/include/mupdf/pdf/xref.h b/include/mupdf/pdf/xref.h new file mode 100644 index 
0000000..ad7bba2 --- /dev/null +++ b/include/mupdf/pdf/xref.h @@ -0,0 +1,300 @@ +// Copyright (C) 2004-2022 Artifex Software, Inc. +// +// This file is part of MuPDF. +// +// MuPDF is free software: you can redistribute it and/or modify it under the +// terms of the GNU Affero General Public License as published by the Free +// Software Foundation, either version 3 of the License, or (at your option) +// any later version. +// +// MuPDF is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +// FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more +// details. +// +// You should have received a copy of the GNU Affero General Public License +// along with MuPDF. If not, see +// +// Alternative licensing terms are available from the licensor. +// For commercial licensing, see or contact +// Artifex Software, Inc., 39 Mesa Street, Suite 108A, San Francisco, +// CA 94129, USA, for further information. + +#ifndef MUPDF_PDF_XREF_H +#define MUPDF_PDF_XREF_H + +#include "mupdf/pdf/document.h" + +/* + Allocate a slot in the xref table and return a fresh unused object number. +*/ +int pdf_create_object(fz_context *ctx, pdf_document *doc); + +/* + Remove object from xref table, marking the slot as free. +*/ +void pdf_delete_object(fz_context *ctx, pdf_document *doc, int num); + +/* + Replace object in xref table with the passed in object. +*/ +void pdf_update_object(fz_context *ctx, pdf_document *doc, int num, pdf_obj *obj); + +/* + Replace stream contents for object in xref table with the passed in buffer. + + The buffer contents must match the /Filter setting if 'compressed' is true. + If 'compressed' is false, the /Filter and /DecodeParms entries are deleted. + The /Length entry is updated. 
+*/ +void pdf_update_stream(fz_context *ctx, pdf_document *doc, pdf_obj *ref, fz_buffer *buf, int compressed); + +/* + Return true if 'obj' is an indirect reference to an object that is held + by the "local" xref section. +*/ +int pdf_is_local_object(fz_context *ctx, pdf_document *doc, pdf_obj *obj); + +pdf_obj *pdf_add_object(fz_context *ctx, pdf_document *doc, pdf_obj *obj); +pdf_obj *pdf_add_object_drop(fz_context *ctx, pdf_document *doc, pdf_obj *obj); +pdf_obj *pdf_add_stream(fz_context *ctx, pdf_document *doc, fz_buffer *buf, pdf_obj *obj, int compressed); + +pdf_obj *pdf_add_new_dict(fz_context *ctx, pdf_document *doc, int initial); +pdf_obj *pdf_add_new_array(fz_context *ctx, pdf_document *doc, int initial); + +typedef struct +{ + char type; /* 0=unset (f)ree i(n)use (o)bjstm */ + unsigned char marked; /* marked to keep alive with pdf_mark_xref */ + unsigned short gen; /* generation / objstm index */ + int num; /* original object number (for decryption after renumbering) */ + int64_t ofs; /* file offset / objstm object number */ + int64_t stm_ofs; /* on-disk stream */ + fz_buffer *stm_buf; /* in-memory stream (for updated objects) */ + pdf_obj *obj; /* stored/cached object */ +} pdf_xref_entry; + +typedef struct pdf_xref_subsec +{ + struct pdf_xref_subsec *next; + int len; + int start; + pdf_xref_entry *table; +} pdf_xref_subsec; + +struct pdf_xref +{ + int num_objects; + pdf_xref_subsec *subsec; + pdf_obj *trailer; + pdf_obj *pre_repair_trailer; + pdf_unsaved_sig *unsaved_sigs; + pdf_unsaved_sig **unsaved_sigs_end; + int64_t end_ofs; /* file offset to end of xref */ +}; + +/** + Retrieve the pdf_xref_entry for a given object. + + This can cause xref reorganisations (solidifications etc) due to + repairs, so all held pdf_xref_entries should be considered + invalid after this call (other than the returned one). 
+*/ +pdf_xref_entry *pdf_cache_object(fz_context *ctx, pdf_document *doc, int num); + +int pdf_count_objects(fz_context *ctx, pdf_document *doc); + +/** + Resolve an indirect object (or chain of objects). + + This can cause xref reorganisations (solidifications etc) due to + repairs, so all held pdf_xref_entries should be considered + invalid after this call (other than the returned one). +*/ +pdf_obj *pdf_resolve_indirect(fz_context *ctx, pdf_obj *ref); +pdf_obj *pdf_resolve_indirect_chain(fz_context *ctx, pdf_obj *ref); + +/** + Load a given object. + + This can cause xref reorganisations (solidifications etc) due to + repairs, so all held pdf_xref_entries should be considered + invalid after this call (other than the returned one). +*/ +pdf_obj *pdf_load_object(fz_context *ctx, pdf_document *doc, int num); +pdf_obj *pdf_load_unencrypted_object(fz_context *ctx, pdf_document *doc, int num); + +/* + Load raw (compressed but decrypted) contents of a stream into buf. +*/ +fz_buffer *pdf_load_raw_stream_number(fz_context *ctx, pdf_document *doc, int num); +fz_buffer *pdf_load_raw_stream(fz_context *ctx, pdf_obj *ref); + +/* + Load uncompressed contents of a stream into buf. +*/ +fz_buffer *pdf_load_stream_number(fz_context *ctx, pdf_document *doc, int num); +fz_buffer *pdf_load_stream(fz_context *ctx, pdf_obj *ref); + +/* + Open a stream for reading the raw (compressed but decrypted) data. +*/ +fz_stream *pdf_open_raw_stream_number(fz_context *ctx, pdf_document *doc, int num); +fz_stream *pdf_open_raw_stream(fz_context *ctx, pdf_obj *ref); + +/* + Open a stream for reading uncompressed data. + Put the opened file in doc->stream. + Using doc->file while a stream is open is a Bad idea. +*/ +fz_stream *pdf_open_stream_number(fz_context *ctx, pdf_document *doc, int num); +fz_stream *pdf_open_stream(fz_context *ctx, pdf_obj *ref); + +/* + Construct a filter to decode a stream, without + constraining to stream length, and without decryption. 
+*/ +fz_stream *pdf_open_inline_stream(fz_context *ctx, pdf_document *doc, pdf_obj *stmobj, int length, fz_stream *chain, fz_compression_params *params); +fz_compressed_buffer *pdf_load_compressed_stream(fz_context *ctx, pdf_document *doc, int num, size_t worst_case); +void pdf_load_compressed_inline_image(fz_context *ctx, pdf_document *doc, pdf_obj *dict, int length, fz_stream *cstm, int indexed, fz_compressed_image *image); +fz_stream *pdf_open_stream_with_offset(fz_context *ctx, pdf_document *doc, int num, pdf_obj *dict, int64_t stm_ofs); +fz_stream *pdf_open_contents_stream(fz_context *ctx, pdf_document *doc, pdf_obj *obj); + +int pdf_version(fz_context *ctx, pdf_document *doc); +pdf_obj *pdf_trailer(fz_context *ctx, pdf_document *doc); +void pdf_set_populating_xref_trailer(fz_context *ctx, pdf_document *doc, pdf_obj *trailer); +int pdf_xref_len(fz_context *ctx, pdf_document *doc); + +pdf_obj *pdf_metadata(fz_context *ctx, pdf_document *doc); + +/* + Used while reading the individual xref sections from a file. +*/ +pdf_xref_entry *pdf_get_populating_xref_entry(fz_context *ctx, pdf_document *doc, int i); + +/* + Used after loading a document to access entries. + + This will never throw anything, or return NULL if it is + only asked to return objects in range within a 'solid' + xref. + + This may "solidify" the xref (so can cause allocations). +*/ +pdf_xref_entry *pdf_get_xref_entry(fz_context *ctx, pdf_document *doc, int i); + +/* + Map a function across all xref entries in a document. +*/ +void pdf_xref_entry_map(fz_context *ctx, pdf_document *doc, void (*fn)(fz_context *, pdf_xref_entry *, int i, pdf_document *doc, void *), void *arg); + + +/* + Used after loading a document to access entries. + + This will never throw anything, or return NULL if it is + only asked to return objects in range within a 'solid' + xref. + + This will never "solidify" the xref, so no entry may be found + (NULL will be returned) for free entries. 
+ + Called with a valid i, this will never try/catch or throw. +*/ +pdf_xref_entry *pdf_get_xref_entry_no_change(fz_context *ctx, pdf_document *doc, int i); +pdf_xref_entry *pdf_get_xref_entry_no_null(fz_context *ctx, pdf_document *doc, int i); +void pdf_replace_xref(fz_context *ctx, pdf_document *doc, pdf_xref_entry *entries, int n); +void pdf_forget_xref(fz_context *ctx, pdf_document *doc); +pdf_xref_entry *pdf_get_incremental_xref_entry(fz_context *ctx, pdf_document *doc, int i); + +/* + Ensure that an object has been cloned into the incremental xref section. +*/ +int pdf_xref_ensure_incremental_object(fz_context *ctx, pdf_document *doc, int num); +int pdf_xref_is_incremental(fz_context *ctx, pdf_document *doc, int num); +void pdf_xref_store_unsaved_signature(fz_context *ctx, pdf_document *doc, pdf_obj *field, pdf_pkcs7_signer *signer); +void pdf_xref_remove_unsaved_signature(fz_context *ctx, pdf_document *doc, pdf_obj *field); +int pdf_xref_obj_is_unsaved_signature(pdf_document *doc, pdf_obj *obj); +void pdf_xref_ensure_local_object(fz_context *ctx, pdf_document *doc, int num); +int pdf_obj_is_incremental(fz_context *ctx, pdf_obj *obj); + +void pdf_repair_xref(fz_context *ctx, pdf_document *doc); +void pdf_repair_obj_stms(fz_context *ctx, pdf_document *doc); +void pdf_repair_trailer(fz_context *ctx, pdf_document *doc); + +/* + Ensure that the current populating xref has a single subsection + that covers the entire range. 
+*/ +void pdf_ensure_solid_xref(fz_context *ctx, pdf_document *doc, int num); +void pdf_mark_xref(fz_context *ctx, pdf_document *doc); +void pdf_clear_xref(fz_context *ctx, pdf_document *doc); +void pdf_clear_xref_to_mark(fz_context *ctx, pdf_document *doc); + +int pdf_repair_obj(fz_context *ctx, pdf_document *doc, pdf_lexbuf *buf, int64_t *stmofsp, int64_t *stmlenp, pdf_obj **encrypt, pdf_obj **id, pdf_obj **page, int64_t *tmpofs, pdf_obj **root); + +pdf_obj *pdf_progressive_advance(fz_context *ctx, pdf_document *doc, int pagenum); + +/* + Return the number of versions that there + are in a file. i.e. 1 + the number of updates that + the file on disc has been through. i.e. internal + unsaved changes to the file (such as appearance streams) + are ignored. Also, the initial write of a linearized + file (which appears as a base file write + an incremental + update) is treated as a single version. +*/ +int pdf_count_versions(fz_context *ctx, pdf_document *doc); +int pdf_count_unsaved_versions(fz_context *ctx, pdf_document *doc); +int pdf_validate_changes(fz_context *ctx, pdf_document *doc, int version); +int pdf_doc_was_linearized(fz_context *ctx, pdf_document *doc); + +typedef struct pdf_locked_fields pdf_locked_fields; +int pdf_is_field_locked(fz_context *ctx, pdf_locked_fields *locked, const char *name); +void pdf_drop_locked_fields(fz_context *ctx, pdf_locked_fields *locked); +pdf_locked_fields *pdf_find_locked_fields(fz_context *ctx, pdf_document *doc, int version); +pdf_locked_fields *pdf_find_locked_fields_for_sig(fz_context *ctx, pdf_document *doc, pdf_obj *sig); + +/* + Check the entire history of the document, and return the number of + the last version that checked out OK. + i.e. 0 = "the entire history checks out OK". + n = "none of the history checked out OK". +*/ +int pdf_validate_change_history(fz_context *ctx, pdf_document *doc); + +/* + Find which version of a document the current version of obj + was defined in. 
+ + version = 0 = latest, 1 = previous update etc, allowing for + the first incremental update in a linearized file being ignored. +*/ +int pdf_find_version_for_obj(fz_context *ctx, pdf_document *doc, pdf_obj *obj); + +/* + Return the number of updates ago when a signature became invalid, + not counting any unsaved changes. + + Thus: + -1 => Has changed in the current unsaved changes. + 0 => still valid. + 1 => became invalid on the last save + n => became invalid n saves ago +*/ +int pdf_validate_signature(fz_context *ctx, pdf_annot *widget); +int pdf_was_pure_xfa(fz_context *ctx, pdf_document *doc); + +/* Local xrefs - designed for holding stuff that shouldn't be written + * back into the actual document, such as synthesized appearance + * streams. */ +pdf_xref *pdf_new_local_xref(fz_context *ctx, pdf_document *doc); + +void pdf_drop_local_xref(fz_context *ctx, pdf_xref *xref); +void pdf_drop_local_xref_and_resources(fz_context *ctx, pdf_document *doc); + +/* Debug call to dump the incremental/local xrefs to the + * debug channel. */ +void pdf_debug_doc_changes(fz_context *ctx, pdf_document *doc); + +#endif diff --git a/include/mupdf/ucdn.h b/include/mupdf/ucdn.h new file mode 100644 index 0000000..f03ae69 --- /dev/null +++ b/include/mupdf/ucdn.h @@ -0,0 +1,452 @@ +/* + * Copyright (C) 2012 Grigori Goronzy + * + * Permission to use, copy, modify, and/or distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#ifndef UCDN_H +#define UCDN_H + +#ifdef __cplusplus +extern "C" { +#endif + +#include "fitz/system.h" + +#define UCDN_EAST_ASIAN_F 0 +#define UCDN_EAST_ASIAN_H 1 +#define UCDN_EAST_ASIAN_W 2 +#define UCDN_EAST_ASIAN_NA 3 +#define UCDN_EAST_ASIAN_A 4 +#define UCDN_EAST_ASIAN_N 5 + +#define UCDN_SCRIPT_COMMON 0 +#define UCDN_SCRIPT_LATIN 1 +#define UCDN_SCRIPT_GREEK 2 +#define UCDN_SCRIPT_CYRILLIC 3 +#define UCDN_SCRIPT_ARMENIAN 4 +#define UCDN_SCRIPT_HEBREW 5 +#define UCDN_SCRIPT_ARABIC 6 +#define UCDN_SCRIPT_SYRIAC 7 +#define UCDN_SCRIPT_THAANA 8 +#define UCDN_SCRIPT_DEVANAGARI 9 +#define UCDN_SCRIPT_BENGALI 10 +#define UCDN_SCRIPT_GURMUKHI 11 +#define UCDN_SCRIPT_GUJARATI 12 +#define UCDN_SCRIPT_ORIYA 13 +#define UCDN_SCRIPT_TAMIL 14 +#define UCDN_SCRIPT_TELUGU 15 +#define UCDN_SCRIPT_KANNADA 16 +#define UCDN_SCRIPT_MALAYALAM 17 +#define UCDN_SCRIPT_SINHALA 18 +#define UCDN_SCRIPT_THAI 19 +#define UCDN_SCRIPT_LAO 20 +#define UCDN_SCRIPT_TIBETAN 21 +#define UCDN_SCRIPT_MYANMAR 22 +#define UCDN_SCRIPT_GEORGIAN 23 +#define UCDN_SCRIPT_HANGUL 24 +#define UCDN_SCRIPT_ETHIOPIC 25 +#define UCDN_SCRIPT_CHEROKEE 26 +#define UCDN_SCRIPT_CANADIAN_ABORIGINAL 27 +#define UCDN_SCRIPT_OGHAM 28 +#define UCDN_SCRIPT_RUNIC 29 +#define UCDN_SCRIPT_KHMER 30 +#define UCDN_SCRIPT_MONGOLIAN 31 +#define UCDN_SCRIPT_HIRAGANA 32 +#define UCDN_SCRIPT_KATAKANA 33 +#define UCDN_SCRIPT_BOPOMOFO 34 +#define UCDN_SCRIPT_HAN 35 +#define UCDN_SCRIPT_YI 36 +#define UCDN_SCRIPT_OLD_ITALIC 37 +#define UCDN_SCRIPT_GOTHIC 38 +#define UCDN_SCRIPT_DESERET 39 +#define UCDN_SCRIPT_INHERITED 40 +#define UCDN_SCRIPT_TAGALOG 41 +#define UCDN_SCRIPT_HANUNOO 42 +#define 
UCDN_SCRIPT_BUHID 43 +#define UCDN_SCRIPT_TAGBANWA 44 +#define UCDN_SCRIPT_LIMBU 45 +#define UCDN_SCRIPT_TAI_LE 46 +#define UCDN_SCRIPT_LINEAR_B 47 +#define UCDN_SCRIPT_UGARITIC 48 +#define UCDN_SCRIPT_SHAVIAN 49 +#define UCDN_SCRIPT_OSMANYA 50 +#define UCDN_SCRIPT_CYPRIOT 51 +#define UCDN_SCRIPT_BRAILLE 52 +#define UCDN_SCRIPT_BUGINESE 53 +#define UCDN_SCRIPT_COPTIC 54 +#define UCDN_SCRIPT_NEW_TAI_LUE 55 +#define UCDN_SCRIPT_GLAGOLITIC 56 +#define UCDN_SCRIPT_TIFINAGH 57 +#define UCDN_SCRIPT_SYLOTI_NAGRI 58 +#define UCDN_SCRIPT_OLD_PERSIAN 59 +#define UCDN_SCRIPT_KHAROSHTHI 60 +#define UCDN_SCRIPT_BALINESE 61 +#define UCDN_SCRIPT_CUNEIFORM 62 +#define UCDN_SCRIPT_PHOENICIAN 63 +#define UCDN_SCRIPT_PHAGS_PA 64 +#define UCDN_SCRIPT_NKO 65 +#define UCDN_SCRIPT_SUNDANESE 66 +#define UCDN_SCRIPT_LEPCHA 67 +#define UCDN_SCRIPT_OL_CHIKI 68 +#define UCDN_SCRIPT_VAI 69 +#define UCDN_SCRIPT_SAURASHTRA 70 +#define UCDN_SCRIPT_KAYAH_LI 71 +#define UCDN_SCRIPT_REJANG 72 +#define UCDN_SCRIPT_LYCIAN 73 +#define UCDN_SCRIPT_CARIAN 74 +#define UCDN_SCRIPT_LYDIAN 75 +#define UCDN_SCRIPT_CHAM 76 +#define UCDN_SCRIPT_TAI_THAM 77 +#define UCDN_SCRIPT_TAI_VIET 78 +#define UCDN_SCRIPT_AVESTAN 79 +#define UCDN_SCRIPT_EGYPTIAN_HIEROGLYPHS 80 +#define UCDN_SCRIPT_SAMARITAN 81 +#define UCDN_SCRIPT_LISU 82 +#define UCDN_SCRIPT_BAMUM 83 +#define UCDN_SCRIPT_JAVANESE 84 +#define UCDN_SCRIPT_MEETEI_MAYEK 85 +#define UCDN_SCRIPT_IMPERIAL_ARAMAIC 86 +#define UCDN_SCRIPT_OLD_SOUTH_ARABIAN 87 +#define UCDN_SCRIPT_INSCRIPTIONAL_PARTHIAN 88 +#define UCDN_SCRIPT_INSCRIPTIONAL_PAHLAVI 89 +#define UCDN_SCRIPT_OLD_TURKIC 90 +#define UCDN_SCRIPT_KAITHI 91 +#define UCDN_SCRIPT_BATAK 92 +#define UCDN_SCRIPT_BRAHMI 93 +#define UCDN_SCRIPT_MANDAIC 94 +#define UCDN_SCRIPT_CHAKMA 95 +#define UCDN_SCRIPT_MEROITIC_CURSIVE 96 +#define UCDN_SCRIPT_MEROITIC_HIEROGLYPHS 97 +#define UCDN_SCRIPT_MIAO 98 +#define UCDN_SCRIPT_SHARADA 99 +#define UCDN_SCRIPT_SORA_SOMPENG 100 +#define UCDN_SCRIPT_TAKRI 101 +#define 
UCDN_SCRIPT_UNKNOWN 102 +#define UCDN_SCRIPT_BASSA_VAH 103 +#define UCDN_SCRIPT_CAUCASIAN_ALBANIAN 104 +#define UCDN_SCRIPT_DUPLOYAN 105 +#define UCDN_SCRIPT_ELBASAN 106 +#define UCDN_SCRIPT_GRANTHA 107 +#define UCDN_SCRIPT_KHOJKI 108 +#define UCDN_SCRIPT_KHUDAWADI 109 +#define UCDN_SCRIPT_LINEAR_A 110 +#define UCDN_SCRIPT_MAHAJANI 111 +#define UCDN_SCRIPT_MANICHAEAN 112 +#define UCDN_SCRIPT_MENDE_KIKAKUI 113 +#define UCDN_SCRIPT_MODI 114 +#define UCDN_SCRIPT_MRO 115 +#define UCDN_SCRIPT_NABATAEAN 116 +#define UCDN_SCRIPT_OLD_NORTH_ARABIAN 117 +#define UCDN_SCRIPT_OLD_PERMIC 118 +#define UCDN_SCRIPT_PAHAWH_HMONG 119 +#define UCDN_SCRIPT_PALMYRENE 120 +#define UCDN_SCRIPT_PAU_CIN_HAU 121 +#define UCDN_SCRIPT_PSALTER_PAHLAVI 122 +#define UCDN_SCRIPT_SIDDHAM 123 +#define UCDN_SCRIPT_TIRHUTA 124 +#define UCDN_SCRIPT_WARANG_CITI 125 +#define UCDN_SCRIPT_AHOM 126 +#define UCDN_SCRIPT_ANATOLIAN_HIEROGLYPHS 127 +#define UCDN_SCRIPT_HATRAN 128 +#define UCDN_SCRIPT_MULTANI 129 +#define UCDN_SCRIPT_OLD_HUNGARIAN 130 +#define UCDN_SCRIPT_SIGNWRITING 131 +#define UCDN_SCRIPT_ADLAM 132 +#define UCDN_SCRIPT_BHAIKSUKI 133 +#define UCDN_SCRIPT_MARCHEN 134 +#define UCDN_SCRIPT_NEWA 135 +#define UCDN_SCRIPT_OSAGE 136 +#define UCDN_SCRIPT_TANGUT 137 +#define UCDN_SCRIPT_MASARAM_GONDI 138 +#define UCDN_SCRIPT_NUSHU 139 +#define UCDN_SCRIPT_SOYOMBO 140 +#define UCDN_SCRIPT_ZANABAZAR_SQUARE 141 +#define UCDN_SCRIPT_DOGRA 142 +#define UCDN_SCRIPT_GUNJALA_GONDI 143 +#define UCDN_SCRIPT_HANIFI_ROHINGYA 144 +#define UCDN_SCRIPT_MAKASAR 145 +#define UCDN_SCRIPT_MEDEFAIDRIN 146 +#define UCDN_SCRIPT_OLD_SOGDIAN 147 +#define UCDN_SCRIPT_SOGDIAN 148 +#define UCDN_SCRIPT_ELYMAIC 149 +#define UCDN_SCRIPT_NANDINAGARI 150 +#define UCDN_SCRIPT_NYIAKENG_PUACHUE_HMONG 151 +#define UCDN_SCRIPT_WANCHO 152 +#define UCDN_SCRIPT_CHORASMIAN 153 +#define UCDN_SCRIPT_DIVES_AKURU 154 +#define UCDN_SCRIPT_KHITAN_SMALL_SCRIPT 155 +#define UCDN_SCRIPT_YEZIDI 156 +#define UCDN_SCRIPT_VITHKUQI 157 +#define 
UCDN_SCRIPT_OLD_UYGHUR 158 +#define UCDN_SCRIPT_CYPRO_MINOAN 159 +#define UCDN_SCRIPT_TANGSA 160 +#define UCDN_SCRIPT_TOTO 161 +#define UCDN_SCRIPT_KAWI 162 +#define UCDN_SCRIPT_NAG_MUNDARI 163 +#define UCDN_LAST_SCRIPT 163 + +#define UCDN_LINEBREAK_CLASS_OP 0 +#define UCDN_LINEBREAK_CLASS_CL 1 +#define UCDN_LINEBREAK_CLASS_CP 2 +#define UCDN_LINEBREAK_CLASS_QU 3 +#define UCDN_LINEBREAK_CLASS_GL 4 +#define UCDN_LINEBREAK_CLASS_NS 5 +#define UCDN_LINEBREAK_CLASS_EX 6 +#define UCDN_LINEBREAK_CLASS_SY 7 +#define UCDN_LINEBREAK_CLASS_IS 8 +#define UCDN_LINEBREAK_CLASS_PR 9 +#define UCDN_LINEBREAK_CLASS_PO 10 +#define UCDN_LINEBREAK_CLASS_NU 11 +#define UCDN_LINEBREAK_CLASS_AL 12 +#define UCDN_LINEBREAK_CLASS_HL 13 +#define UCDN_LINEBREAK_CLASS_ID 14 +#define UCDN_LINEBREAK_CLASS_IN 15 +#define UCDN_LINEBREAK_CLASS_HY 16 +#define UCDN_LINEBREAK_CLASS_BA 17 +#define UCDN_LINEBREAK_CLASS_BB 18 +#define UCDN_LINEBREAK_CLASS_B2 19 +#define UCDN_LINEBREAK_CLASS_ZW 20 +#define UCDN_LINEBREAK_CLASS_CM 21 +#define UCDN_LINEBREAK_CLASS_WJ 22 +#define UCDN_LINEBREAK_CLASS_H2 23 +#define UCDN_LINEBREAK_CLASS_H3 24 +#define UCDN_LINEBREAK_CLASS_JL 25 +#define UCDN_LINEBREAK_CLASS_JV 26 +#define UCDN_LINEBREAK_CLASS_JT 27 +#define UCDN_LINEBREAK_CLASS_RI 28 +#define UCDN_LINEBREAK_CLASS_AI 29 +#define UCDN_LINEBREAK_CLASS_BK 30 +#define UCDN_LINEBREAK_CLASS_CB 31 +#define UCDN_LINEBREAK_CLASS_CJ 32 +#define UCDN_LINEBREAK_CLASS_CR 33 +#define UCDN_LINEBREAK_CLASS_LF 34 +#define UCDN_LINEBREAK_CLASS_NL 35 +#define UCDN_LINEBREAK_CLASS_SA 36 +#define UCDN_LINEBREAK_CLASS_SG 37 +#define UCDN_LINEBREAK_CLASS_SP 38 +#define UCDN_LINEBREAK_CLASS_XX 39 +#define UCDN_LINEBREAK_CLASS_ZWJ 40 +#define UCDN_LINEBREAK_CLASS_EB 41 +#define UCDN_LINEBREAK_CLASS_EM 42 + +#define UCDN_GENERAL_CATEGORY_CC 0 +#define UCDN_GENERAL_CATEGORY_CF 1 +#define UCDN_GENERAL_CATEGORY_CN 2 +#define UCDN_GENERAL_CATEGORY_CO 3 +#define UCDN_GENERAL_CATEGORY_CS 4 +#define UCDN_GENERAL_CATEGORY_LL 5 +#define 
UCDN_GENERAL_CATEGORY_LM 6 +#define UCDN_GENERAL_CATEGORY_LO 7 +#define UCDN_GENERAL_CATEGORY_LT 8 +#define UCDN_GENERAL_CATEGORY_LU 9 +#define UCDN_GENERAL_CATEGORY_MC 10 +#define UCDN_GENERAL_CATEGORY_ME 11 +#define UCDN_GENERAL_CATEGORY_MN 12 +#define UCDN_GENERAL_CATEGORY_ND 13 +#define UCDN_GENERAL_CATEGORY_NL 14 +#define UCDN_GENERAL_CATEGORY_NO 15 +#define UCDN_GENERAL_CATEGORY_PC 16 +#define UCDN_GENERAL_CATEGORY_PD 17 +#define UCDN_GENERAL_CATEGORY_PE 18 +#define UCDN_GENERAL_CATEGORY_PF 19 +#define UCDN_GENERAL_CATEGORY_PI 20 +#define UCDN_GENERAL_CATEGORY_PO 21 +#define UCDN_GENERAL_CATEGORY_PS 22 +#define UCDN_GENERAL_CATEGORY_SC 23 +#define UCDN_GENERAL_CATEGORY_SK 24 +#define UCDN_GENERAL_CATEGORY_SM 25 +#define UCDN_GENERAL_CATEGORY_SO 26 +#define UCDN_GENERAL_CATEGORY_ZL 27 +#define UCDN_GENERAL_CATEGORY_ZP 28 +#define UCDN_GENERAL_CATEGORY_ZS 29 + +#define UCDN_BIDI_CLASS_L 0 +#define UCDN_BIDI_CLASS_LRE 1 +#define UCDN_BIDI_CLASS_LRO 2 +#define UCDN_BIDI_CLASS_R 3 +#define UCDN_BIDI_CLASS_AL 4 +#define UCDN_BIDI_CLASS_RLE 5 +#define UCDN_BIDI_CLASS_RLO 6 +#define UCDN_BIDI_CLASS_PDF 7 +#define UCDN_BIDI_CLASS_EN 8 +#define UCDN_BIDI_CLASS_ES 9 +#define UCDN_BIDI_CLASS_ET 10 +#define UCDN_BIDI_CLASS_AN 11 +#define UCDN_BIDI_CLASS_CS 12 +#define UCDN_BIDI_CLASS_NSM 13 +#define UCDN_BIDI_CLASS_BN 14 +#define UCDN_BIDI_CLASS_B 15 +#define UCDN_BIDI_CLASS_S 16 +#define UCDN_BIDI_CLASS_WS 17 +#define UCDN_BIDI_CLASS_ON 18 +#define UCDN_BIDI_CLASS_LRI 19 +#define UCDN_BIDI_CLASS_RLI 20 +#define UCDN_BIDI_CLASS_FSI 21 +#define UCDN_BIDI_CLASS_PDI 22 + +#define UCDN_BIDI_PAIRED_BRACKET_TYPE_OPEN 0 +#define UCDN_BIDI_PAIRED_BRACKET_TYPE_CLOSE 1 +#define UCDN_BIDI_PAIRED_BRACKET_TYPE_NONE 2 + +/** + * Return version of the Unicode database. + * + * @return Unicode database version + */ +const char *ucdn_get_unicode_version(void); + +/** + * Get combining class of a codepoint. 
+ * + * @param code Unicode codepoint + * @return combining class value, as defined in UAX#44 + */ +int ucdn_get_combining_class(uint32_t code); + +/** + * Get east-asian width of a codepoint. + * + * @param code Unicode codepoint + * @return value according to UCDN_EAST_ASIAN_* and as defined in UAX#11. + */ +int ucdn_get_east_asian_width(uint32_t code); + +/** + * Get general category of a codepoint. + * + * @param code Unicode codepoint + * @return value according to UCDN_GENERAL_CATEGORY_* and as defined in + * UAX#44. + */ +int ucdn_get_general_category(uint32_t code); + +/** + * Get bidirectional class of a codepoint. + * + * @param code Unicode codepoint + * @return value according to UCDN_BIDI_CLASS_* and as defined in UAX#44. + */ +int ucdn_get_bidi_class(uint32_t code); + +/** + * Get script of a codepoint. + * + * @param code Unicode codepoint + * @return value according to UCDN_SCRIPT_* and as defined in UAX#24. + */ +int ucdn_get_script(uint32_t code); + +/** + * Get unresolved linebreak class of a codepoint. This does not take + * rule LB1 of UAX#14 into account. See ucdn_get_resolved_linebreak_class() + * for resolved linebreak classes. + * + * @param code Unicode codepoint + * @return value according to UCDN_LINEBREAK_* and as defined in UAX#14. + */ +int ucdn_get_linebreak_class(uint32_t code); + +/** + * Get resolved linebreak class of a codepoint. This resolves characters + * in the AI, SG, XX, SA and CJ classes according to rule LB1 of UAX#14. + * In addition the CB class is resolved as the equivalent B2 class and + * the NL class is resolved as the equivalent BK class. + * + * @param code Unicode codepoint + * @return value according to UCDN_LINEBREAK_* and as defined in UAX#14. + */ +int ucdn_get_resolved_linebreak_class(uint32_t code); + +/** + * Check if codepoint can be mirrored. 
+ * + * @param code Unicode codepoint + * @return 1 if mirrored character exists, otherwise 0 + */ +int ucdn_get_mirrored(uint32_t code); + +/** + * Mirror a codepoint. + * + * @param code Unicode codepoint + * @return mirrored codepoint or the original codepoint if no + * mirrored character exists + */ +uint32_t ucdn_mirror(uint32_t code); + +/** + * Get paired bracket for a codepoint. + * + * @param code Unicode codepoint + * @return paired bracket codepoint or the original codepoint if no + * paired bracket character exists + */ +uint32_t ucdn_paired_bracket(uint32_t code); + +/** + * Get paired bracket type for a codepoint. + * + * @param code Unicode codepoint + * @return value according to UCDN_BIDI_PAIRED_BRACKET_TYPE_* and as defined + * in UAX#9. + * + */ +int ucdn_paired_bracket_type(uint32_t code); + +/** + * Pairwise canonical decomposition of a codepoint. This includes + * Hangul Jamo decomposition (see chapter 3.12 of the Unicode core + * specification). + * + * Hangul is decomposed into L and V jamos for LV forms, and an + * LV precomposed syllable and a T jamo for LVT forms. + * + * @param code Unicode codepoint + * @param a filled with first codepoint of decomposition + * @param b filled with second codepoint of decomposition, or 0 + * @return success + */ +int ucdn_decompose(uint32_t code, uint32_t *a, uint32_t *b); + +/** + * Compatibility decomposition of a codepoint. + * + * @param code Unicode codepoint + * @param decomposed filled with decomposition, must be able to hold 18 + * characters + * @return length of decomposition or 0 in case none exists + */ +int ucdn_compat_decompose(uint32_t code, uint32_t *decomposed); + +/** + * Pairwise canonical composition of two codepoints. This includes + * Hangul Jamo composition (see chapter 3.12 of the Unicode core + * specification). + * + * Hangul composition expects either L and V jamos, or an LV + * precomposed syllable and a T jamo. This is exactly the inverse + * of pairwise Hangul decomposition. 
+ * + * @param code filled with composition + * @param a first codepoint + * @param b second codepoint + * @return success + */ +int ucdn_compose(uint32_t *code, uint32_t a, uint32_t b); + +#ifdef __cplusplus +} +#endif + +#endif diff --git a/libs/libmupdf_linux_amd64.a b/libs/libmupdf_linux_amd64.a new file mode 100644 index 0000000..4674cb3 Binary files /dev/null and b/libs/libmupdf_linux_amd64.a differ diff --git a/libs/libmupdfthird_linux_amd64.a b/libs/libmupdfthird_linux_amd64.a new file mode 100644 index 0000000..63f0697 Binary files /dev/null and b/libs/libmupdfthird_linux_amd64.a differ diff --git a/testdata/test.bmp b/testdata/test.bmp new file mode 100644 index 0000000..02dc6d6 Binary files /dev/null and b/testdata/test.bmp differ diff --git a/testdata/test.cbz b/testdata/test.cbz new file mode 100644 index 0000000..168f6f5 Binary files /dev/null and b/testdata/test.cbz differ diff --git a/testdata/test.epub b/testdata/test.epub new file mode 100644 index 0000000..9ccf430 Binary files /dev/null and b/testdata/test.epub differ diff --git a/testdata/test.fb2 b/testdata/test.fb2 new file mode 100644 index 0000000..f1292bf --- /dev/null +++ b/testdata/test.fb2 @@ -0,0 +1,11 @@ + + + + + Hello World + + + +
Hello World
+ +
diff --git a/testdata/test.gif b/testdata/test.gif new file mode 100644 index 0000000..d5e923d Binary files /dev/null and b/testdata/test.gif differ diff --git a/testdata/test.jb2 b/testdata/test.jb2 new file mode 100644 index 0000000..ae48d91 Binary files /dev/null and b/testdata/test.jb2 differ diff --git a/testdata/test.jp2 b/testdata/test.jp2 new file mode 100644 index 0000000..63714f9 Binary files /dev/null and b/testdata/test.jp2 differ diff --git a/testdata/test.jpg b/testdata/test.jpg new file mode 100644 index 0000000..3219d31 Binary files /dev/null and b/testdata/test.jpg differ diff --git a/testdata/test.jxr b/testdata/test.jxr new file mode 100644 index 0000000..3e4fc35 Binary files /dev/null and b/testdata/test.jxr differ diff --git a/testdata/test.pam b/testdata/test.pam new file mode 100644 index 0000000..1653f1f --- /dev/null +++ b/testdata/test.pam @@ -0,0 +1,10 @@ +P7 +WIDTH 3 +HEIGHT 3 +DEPTH 1 +MAXVAL 1 +TUPLETYPE blackandwhite +ENDHDR +1 0 1 +0 1 0 +1 0 1 \ No newline at end of file diff --git a/testdata/test.pbm b/testdata/test.pbm new file mode 100644 index 0000000..fbcb39d --- /dev/null +++ b/testdata/test.pbm @@ -0,0 +1,5 @@ +P1 +3 3 +1 0 1 +0 1 0 +1 0 1 \ No newline at end of file diff --git a/testdata/test.pdf b/testdata/test.pdf new file mode 100644 index 0000000..0afee43 Binary files /dev/null and b/testdata/test.pdf differ diff --git a/testdata/test.pfm b/testdata/test.pfm new file mode 100644 index 0000000..4b51ce6 Binary files /dev/null and b/testdata/test.pfm differ diff --git a/testdata/test.pgm b/testdata/test.pgm new file mode 100644 index 0000000..d2728b9 --- /dev/null +++ b/testdata/test.pgm @@ -0,0 +1,6 @@ +P2 +3 3 +2 +0 1 2 +0 1 2 +0 1 2 \ No newline at end of file diff --git a/testdata/test.png b/testdata/test.png new file mode 100644 index 0000000..a7045ed Binary files /dev/null and b/testdata/test.png differ diff --git a/testdata/test.ppm b/testdata/test.ppm new file mode 100644 index 0000000..63b65ad --- /dev/null +++ 
b/testdata/test.ppm @@ -0,0 +1,12 @@ +P3 +3 3 +255 +255 0 0 + 0 255 0 + 0 0 255 +255 255 0 + 0 255 0 +255 0 255 +255 255 255 + 0 255 0 + 0 255 255 \ No newline at end of file diff --git a/testdata/test.tif b/testdata/test.tif new file mode 100644 index 0000000..c131b8f Binary files /dev/null and b/testdata/test.tif differ diff --git a/testdata/test.xps b/testdata/test.xps new file mode 100644 index 0000000..d82f61e Binary files /dev/null and b/testdata/test.xps differ