Terry's Worklog – On my work (programming, digital libraries, cataloging) and other stuff that perks my interest (family, cycling, etc)

MarcEdit 7.3.x/7.5.x (beta) Updates
By reeset / On February 2, 2021 / In MarcEdit

Versions are available at: https://marcedit.reeset.net/downloads

Information about the changes:
7.3.10 Change Log: https://marcedit.reeset.net/software/update7.txt
7.5.0 Change Log: https://marcedit.reeset.net/software/update75.txt

If you are using 7.x – this will prompt as normal for update. 7.5.x is the beta build, please be aware I expect to be releasing updates to this build weekly and also expect to find some issues.

Questions, let me know.

–tr

MarcEdit 7.5.x/MacOS 3.5.x Timelines
By reeset / On January 26, 2021 / In MarcEdit

I sent this to the MarcEdit Listserv to provide info about my thoughts around timelines related to the beta and release. Here’s the info.

Dear All,

As we are getting close to Feb. 1 (when I’ll make the 7.5 beta build available for testing) – I wanted to provide information about the update process going forward.

Feb. 1:
MarcEdit 7.5 Download will be released. This will be a single build that includes both the 32 and 64 bit builds, dependencies, and can install if you have Admin rights or non-admin rights. I expect to be releasing new builds weekly – with the goal of taking the beta tag off the build no later than April 1.

MarcEdit 7.3.x
I’ll be providing updates for 7.3.x till 7.5 comes out of beta. This will fold in some changes (mostly bug fixes) when possible.

MarcEdit MacOS 3.2.x
I’ll be providing updates for MacOS 3.2.x till 3.5 is out and out of beta.

MarcEdit MacOS 3.5.x Beta
Once MarcEdit 7.5.x beta is out, I’ll be looking to push a 3.5.x beta by mid-late Feb.
Again, with the idea of taking the beta tag off by April (assuming I make the beta timeline).

March 2021
MarcEdit MacOS 3.5.x beta will be out and active (with likely weekly builds)
MarcEdit 7.5.x beta – testing assessed and then determine how long the beta process continues (with April 1 being the end bookend date)
MarcEdit 7.3.x – updates continue
MarcEdit MacOS 3.2.x – updates continue

April 2021
MarcEdit 7.5.x comes out of Beta
MarcEdit 7.3.x is deprecated
MarcEdit MacOS 3.5.x beta assessed – end bookend date is April 15th if above timelines are met

May 2021
MarcEdit MacOS 3.5.x is out of beta
MarcEdit MacOS 3.2.x is deprecated

Let me know if you have questions.

MarcEdit 7.5 Change/Bug Fix list
By reeset / On January 20, 2021 / In MarcEdit

* Updated; 1/20

Change: Allow OS to manage supported Security Protocol types.
Change: Remove com.sun dependency related to dns and httpserver
Change: Changed AppData Path
Change: First install automatically imports settings from MarcEdit 7.0-2.x
Change: Field Count – simplify UI (consolidate elements)
Change: 008 Windows – update help urls to oclc
Change: Generate FAST Headings – update help urls
Change: .NET changes thread stats queuing. Updating thread processing on forms:
* Generate FAST Headings
* Batch Process Records
* Build Links
* Main Window
* RDA Helper
* Delete Selected Records
* MARC Tools
* Check URL Tools
* MARCValidator
Change: XML Function List – update process for opening URLs
Change: Z39.50 Preferences Window – update process for opening URLs
Change: About Windows – new information, updated how version information is calculated.
Change: Catalog Calculator Window – update process for opening URLs
Change: Generate Call Numbers – update process for opening URLs
Change: Generate Material Formats – update process for opening URLs
Change: Tab Delimiter – remove context windows
Change: Tab Delimiter – new options UI
Change: Tab Delimiter – normalization changes
Change: Remove Old Help HTML Page
Change: Remove old Hex Editor Page
Change: Updated Hex Editor to integrate into main program
Change: Main Window – remove custom scheduler dependency
Change: UI Update to allow more items
Change: Main Window – new icon
Change: Main Window – update process for opening URLs
Change: Main Window – removed context menus
Change: Main Window – Upgrade changes to new executable name
Change: Main Window – Updated the following menu Items:
* Edit Linked Data Tools
* Removed old help menu item
* Added new application shortcut
Change: OCLC Bulk Downloader – new UI elements to correspond to new OCLC API
Change: OCLC Search Page – new UI elements to correspond to new OCLC API
Change: Preferences – Updates related to various preference changes:
* Hex Editor
* Integrations
* Editor
* Other
Change: RDA Helper – update process for opening URLs
Change: RDA Helper – Opening files for editing
Change: Removed the Script Maker
Change: Templates for Perl and vbscripts includes
Change: Removed Find/Search XML in the XML Editor and consolidated in existing windows
Change: Delete Selected Records: Exposed the form and controls to the MarcEditor
Change: Sparql Browser – update process for opening URLs
Change: Sparql Browser – removed context menus
Change: TroubleShooting Wizard – Added more error codes and kb information to the Wizard
Change: UNIMARC Utility – controls change, configurable transform selections
Change: MARC Utilities – removed the context menu
Change: First Run Wizard – new options, new agent images
Change: XML Editor – Delete Block Addition
Change: XML Editor – XQuery transform support
Change: XML Profile Wizard – option to process attributes
Change: MarcEditor – Status Bar control doesn’t exist in .NET 5.0. Control has changed.
Change: MarcEditor – Improved Page Loading
Change: MarcEditor – File Tracking updated to handle times when the file opened is a temp record
Change: MarcEditor – removed ~7k of old code
Change: MarcEditor – Added Delete Selected Records Option
Change: Removed helper code used by Installer
Change: Removed Office2007 menu formatting code
Change: Consolidated Extensions into new class (removed 3 files)
Change: Removed calls Marshalled to the Windows API – replaced with Managed Code
Change: OpenRefine Format handler updated to capture changes between OpenRefine versions
Change: MarcEngine – namespace update to 75
Bug Fix: Main Window – corrects process for determining version for update
Bug Fix: Main Window – Updated image
Bug Fix: When doing first run, wizard not showing in some cases.
Bug Fix: Main Window – Last Tool used sometimes shows duplicates
Bug Fix: RDA Helper – $e processing
Bug Fix: RDA Helper – punctuation in the $e
Bug Fix: XML Profile Wizard – When the top element is selected, it’s not viewed for processing (which means not seeing element data or attribute data)
Bug Fix: MarcEditor – Page Processing corrected to handle invalidly formatted data better

MarcEdit 7.5 Updates
By reeset / On January 12, 2021 / In MarcEdit

Current list of MarcEdit 7.5 general updates. I’ll be walking through many of these changes in a webinar 1/15.
Significant Changes:
* Targeted Framework: .NET 5.0 (What’s new in .NET 5 | Microsoft Docs)
* XML Wizard Changes
  * Support for Attribute-based mapping (extends previous entity based mapping)
* Linked Data Components updated
  * SPARQL Components Updated
  * Linked Data Rules File Format Updates
    * Multiple rule blocks for the same field number allowed
    * Allows for redirection of URI to different fields (than the ones evaluated)
* Delimited Text Translator
  * Ability to add custom mnemonic replacements (any {000$a} option allowed)
  * No longer a stand alone program
    * Now a part of main MarcEdit app
    * Command-Line options integrated into the MarcEdit Command-Line options
* OCLC Updates
  * Shift to Metadata Search API (removes reliance on old search API)
  * New indexes available for query
    * Examples: catalog date, catalog org, etc.
* UNIMARC Tools
  * Shifted to MARC Flavors
  * Allows users to add new translations
  * Configuration-based
  * New Chinese MARC 2 MARC21 translation added

Incremental Changes
* Upgrade Wizard updates
* RDA Helper
  * 100$e updated to assess both the $e and $4
  * Updates related to updated RDA
  * Will assess only bibliographic records if bibliographic, authority, or holdings records are included in the same file.
* UI Updates
  * Updated Home Screen with apps added to the main screen
  * Integration with the command-line tool
* XML Editor
  * Record specific tools
  * Delete Block
  * XQuery Processing Option
* Installer
  * Exposed the extensions to the install process
  * Embedded necessary dependencies (.NET) in the installer
  * Single Installer (32 or 64 bit)
* Linked Data
  * New Collections
    * Wikidata for example

MarcEdit 7.5 Update Status
By reeset / On November 30, 2020 / In Uncategorized

I’m planning to start making testing versions of the new MarcEdit instance available broadly around the first of the year, and to a handful of testers in mid-Dec. The translation from .NET 4.7.2 to .NET 5 was more significant than I would have thought – and includes a number of swapped default values – so I’m hunting down behavior changes.
Currently, the following updates have been completed.

Framework used: .NET 5.0
RDA Helper: 100$e process modified. Added criteria to $e generation. Previously, if a $e was already present, a new $e wasn’t added. Now, if a $e or $4 is present, a $e won’t be generated.
RDA Helper: Changes related to RDA updates
Added new elements to the new window programs for pinning
XML Editor: Delete Block element added
XML Editor: XQuery processing option
If a set of records includes bibliographic and authority records, the RDA helper will skip the authority records
Updated Installation Wizard (allows migration of 6.x and 7.x content into the tool)
Updating OCLC Integration to use new Metadata API Search
Delimited Text Translator – added ability to use custom mnemonic replacements
Delimited Text Translator – no longer a stand alone program
* App part of main marcedit app
* Command line options folded into marcedit app
[in process] linked data rules file version 2
* Enhancements to the rules file schema

–tr

Changes to System.Diagnostics.Process in .NET Core
By reeset / On November 19, 2020 / In Uncategorized

In .NET Core, one of the changes that caught me by surprise is the change related to starting processes. In the .NET Framework – you can open a web site, file, etc. just by using the following:

    System.Diagnostics.Process.Start(path);

However, in .NET Core – this won’t work. When trying to open a file, the process will fail – reporting that a program isn’t associated with the file type. When trying to open a folder on the system, the process will fail with a permission error unless the application is running with administrator permissions (which you don’t want to be doing). The change is related to a change in a property default – specifically:

    System.Diagnostics.ProcessStartInfo.UseShellExecute

In the .NET Framework – this property is set to true by default. In .NET Core, it is set to false. The difference here probably makes sense – .NET Core is meant to be more portable and you do need to change this value on some systems. To fix this, I’d recommend removing any direct calls to Process.Start and running them through functions like these:

    public static void OpenURL(string url)
    {
        var psi = new System.Diagnostics.ProcessStartInfo
        {
            FileName = url,
            UseShellExecute = true
        };
        try
        {
            System.Diagnostics.Process.Start(psi);
        }
        catch
        {
            // Retry without the shell on systems where shell execute fails.
            psi.UseShellExecute = false;
            System.Diagnostics.Process.Start(psi);
        }
    }

    public static void OpenFileOrFolder(string spath, string sargs = "")
    {
        var psi = new System.Diagnostics.ProcessStartInfo
        {
            FileName = spath,
            UseShellExecute = true
        };

        // Arguments only apply to files; a folder is opened as-is.
        System.IO.FileAttributes attr = System.IO.File.GetAttributes(spath);
        bool isDirectory = (attr & System.IO.FileAttributes.Directory) == System.IO.FileAttributes.Directory;
        if (!isDirectory && sargs.Trim().Length != 0)
        {
            psi.Arguments = sargs;
        }

        try
        {
            System.Diagnostics.Process.Start(psi);
        }
        catch
        {
            // Retry without the shell on systems where shell execute fails.
            psi.UseShellExecute = false;
            System.Diagnostics.Process.Start(psi);
        }
    }

Since this vexed me for a little bit – I’m putting this here so I don’t forget.

–tr

MarcEdit 7.5/MarcEdit Mac 3.5 Work
By reeset / On November 16, 2020 / In MarcEdit

Every year, around this time, I try to dedicate significant time to address any large project work that may have been percolating around MarcEdit. This year will be no different. Over the past 4 months, I’ve been working on moving MarcEdit away from the .NET 4.7.2 Framework to .NET Core 3.1.
There are a lot of reasons for looking at this, the most important being that this is the direction Microsoft is taking the framework – a move to unify the various .NET development platforms to make distribution and maintenance easier. Well, with the release of .NET 5 this Nov., all the tools I need to officially make this transition are now in place.

So, over the next two months, I’ll be working on shifting MarcEdit away from Framework 4.7.2 and to .NET 5. I believe this will be possible – I only have concerns about two libraries that I rely on – and if I have to, both are open source so I can look at potentially spending time helping the project maintainers target a non-framework build. My hope is to have a working version of MarcEdit using .NET 5 by Thanksgiving that I can start unit testing and testing locally. Of course, with this change, I’ll also have to change the installer process. The reason is that this transition will remove the necessity of having .NET installed on one’s machine. One of the changes to the framework is the ability to publish self-contained applications – allowing for faster startup and lower memory usage. This is something I’m excited about, as I currently move slowly when updating build frameworks due to the need to have these frameworks installed locally. By removing that dependency, I’m hoping to be able to take advantage of changes to the C# language that make programming easier and more efficient, while also allowing me to remove some of the workaround code I’ve had to develop to account for bugs or limitations in previous frameworks.

Finally, this change is going to simplify a lot of cross platform development – and once the initial transition has occurred, I’ll be spending time working on expanding the MarcEdit MacOS version. There are a couple of areas where this program still lacks parity in relation to the Windows version, and these changes will give me the opportunity to close many of these gaps.
–tr

MarcEdit: Identifying Invalid UTF-8 Data in MARC Records
By reeset / On September 9, 2020 / In MarcEdit

Ah Dante – if only he had been a librarian. I’m almost certain that had the Divine Comedy been written by a cataloger – character encodings and those that mangle them – would definitely make an appearance. I can almost see the story in my head. Our wayward traveler, confused when our guide, Virgil, comments on the unholy mess libraries, vendors, and tool writers in general have made of the implementation of UTF-8 across the library spectrum – takes us to the 5th circle of hell filled with broken characters and undefined character boxes. But spend any time working in metadata management today, and the problems of mixed Unicode normalizations, the false equivalency of ISO-8859-2 and UTF-8 (especially by vendors that serve Western European markets), lackluster font development, and applications and programming languages that quietly and happily mangle UTF-8 data as part of general course – and you can suddenly see why we might make a stop at the lake of fire and eternal damnation.

Within MarcEdit, one of the hardest things that the application does is attempt to correct and normalize character encodings across the various known codepoints. This isn’t super easy – especially when our MARC forepersons made that fateful decision to create MARC-8, a 100% imaginary character encoding only (kind of) supported within the Library community and software. These kinds of decisions, and the desire to maintain legacy compatibility, have haunted our metadata and made working with it immensely complicated. Sometimes, these complications can be managed; other times, they are so gruesomely mangled that Brutus, himself, would cry yield. That’s what this new option will attempt to help remediate.

Through the years, I’ve often helped individuals come up with a wide variety of ways to identify invalid UTF-8 characters that litter library records.
Sometimes, this can be straightforward, but more often, it’s not. To that end, I’ve attempted to provide a couple of tools that will hopefully help to identify and support some kind of remediation for catalogers haunted by the specter of bad data.

Identification

The first enhancement comes in the MARCValidator. When validating a record against the rules file, the tool will automatically attempt to determine if UTF-8 data (if present) found within a record is valid. If not, the information will be presented as a warning – identifying the field, record number, and data where the invalid data was identified.

By facilitating a process to identify invalid UTF-8 record data within the validator – the idea is that this will empower catalogers looking to take a more active role in rooting out bad diacritical data before a record is loaded into the catalog and made available to the public.

Removing bad data

In addition to identification, I’ve added three new options to give users different options for dealing with invalid character data.

Delete Subfields
Added to the Edit Subfield Utility – I’ve included an option to evaluate and delete a subfield if invalid character set data is encountered.

Delete Fields
Added to the Add/Delete Field Utility – I’ve included an option to evaluate and delete a field if invalid character set data is encountered.

Delete Records
Added to the Delete Records tool within the MarcEditor – I’ve included an option to delete a record if a field or field group has been identified as having invalid character set data. Additionally, this tool will create a second file in the same directory as the file being processed, that will contain the deleted records in a file structured as: [name of original file]_bad_yyyyMMddhhmmss.mrk

Caveat Emptor

Hopefully, the above sounds useful. I think it will be. There have been many times where I wish I had these tools readily at my fingertips. If it were only this easy.
I believe I mentioned above... encodings are difficult. The Unicode specification is constantly changing, and identifying invalid characters is definitely more art than science in many cases. There are tools and established algorithms. I use these approaches. I’m also leveraging a method within the .NET Framework, CharUnicodeInfo.GetUnicodeCategory, which attempts to take a character and break it down into its character classification. When a character isn’t classified – that’s usually a good indicator that it’s not valid. This process won’t catch everything, but it hopefully will provide a good starting point for users vexed with these issues and in need of a tool in their toolbox to attempt to remediate them.

Conclusion

My hope is that these new options will give catalogers a little more control and insight into their records – specifically given how invisible character encoding issues often are. And maybe too, by shedding light on this most vexing of issues, I can buy myself a little less time in cataloging purgatory as I’m sure there will come a point, somewhere, sometime, where my own contributions to keeping MARC alive and active will be held to account.

These new options will show up in MarcEdit and MarcEdit Mac in versions 7.2.210 (Windows) and 3.2.100 (Mac).

Questions, let me know.

–tr

[1] The fifth circle, illustrated by Stradanus (https://en.wikipedia.org/wiki/Inferno_(Dante)#/media/File:Stradano_Inferno_Canto_08.jpg)

MarcEdit 7.2.200
By reeset / On August 30, 2020 / In MarcEdit

I’ve worked on a number of updates this weekend – here is the list:

UI Changes

I’ve removed the quick links on the front page, and changed this to a list of selectable topics. This will make it easier for me to add to this list.

I’ve added a new Quick Access button to the top ribbon. At this point, this isn’t configurable. Will work to make it configurable later.
These Quick Access items have been added to the Marc Tools window – with the removal of the old quick links as well.

Network Changes

MarcEdit uses .NET 4.7.2. Internally, the tool has traditionally used the HttpWebRequest class. Accessing this class directly has been deprecated, with the preferred method shifting to the HttpClient class in the System.Net.Http assembly. This object is thread-safe and works natively with the System.Threading.Tasks structure. This also has the benefit of allowing .NET to gracefully support older TLS standards, which isn’t the default. By default, .NET selects support for the default TLS instance utilized by the operating system and disables older standards. This is problematic – and these changes will give me more control over which TLS instances are supported and how fallback is supported. This required updating 9 assemblies.

MarcEditor Changes

Bug Fix: When opening mrc records into the MarcEditor, a memory leak can occur with large files. I’ve corrected this.

Bug Fix: MarcEdit uses a custom created control that allows the tool to select the most current version of the Richtext library when showing the MarcEditor. In .NET 4.7 – there appears to have been a behavior change, in that the names used to register classes in Windows need to be all upper case. If they aren’t, then an error is thrown when mixing the enhanced control and the .NET Framework’s default Richtextbox control (which uses the older richtext library). For example: if internally, the enhanced control used RichEdit5W and then the Richtextbox was used, the program would throw an error. This wasn’t a problem in MarcEdit, because I only use the enhanced control, but users that may create plugins against MarcEdit may experience issues. The correction is to use uppercase text to normalize class names now used by .NET 4.7+ (Example: RICHEDIT5W).
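For the Network Changes described above, here is a minimal sketch of what explicit TLS selection with HttpClient can look like. It is not MarcEdit's code: the class name HttpSketch, the method names, and the 30 second timeout are hypothetical, and the protocol set passed in is only an illustration. It relies on HttpClientHandler.SslProtocols, which is available in .NET Framework 4.7.1+ and .NET Core.

```csharp
using System;
using System.Net.Http;
using System.Security.Authentication;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only (not part of MarcEdit).
public static class HttpSketch
{
    // Build a client that only negotiates the protocols the caller lists,
    // instead of relying on the operating system default.
    public static HttpClient CreateClient(SslProtocols protocols)
    {
        var handler = new HttpClientHandler { SslProtocols = protocols };
        return new HttpClient(handler) { Timeout = TimeSpan.FromSeconds(30) };
    }

    // Fetch a URL as text; a non-success status code raises an exception.
    public static async Task<string> FetchAsync(HttpClient client, string url)
    {
        using (var response = await client.GetAsync(url).ConfigureAwait(false))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
        }
    }
}
```

A caller that must reach an older server could pass SslProtocols.Tls12 | SslProtocols.Tls11, while passing SslProtocols.None defers the choice back to the operating system. Because HttpClient is thread-safe, one instance per protocol set can be created once and shared across requests.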
Z39.50/SRU Changes

Enhancement: Cleaned up some code related to how records display inside the Results Viewer when pulling non-MARC data.

Validate Headings

Behavior Change: Check $a Only with Subjects. When working with 60x or 610 – this setting doesn’t work like folks might expect. This is because names often include additional information that must be provided or false variants can be noted. When working with 60x or 610 data – the program will now include all subfields used when validating the 1xx fields and update data with variants accordingly. When $a isn’t selected, then the tool will utilize all fields noted as used for validation in the rules file. This is a behavior change, but likely more in line with the expectations that I’m guessing most folks have when using the $a option.

Behavior Change: When changing variants – it appears that multiple $a’s would be placed. I’m not sure if there was a change on the source record side or not – so instead, I just updated the code to ensure that the tool validated specific data before making updates.

–tr

Build New Field Changes
By reeset / On July 25, 2020 / In MarcEdit

** Updated: Official Help page in the KB: https://marcedit.reeset.net/build-new-field

This isn’t going to meet all the use cases I’ve seen – but this should address the most common question that comes up – the ability to have the build new field generate multiple fields.

The process will be based on the presence or lack of a new element in the pattern – a variable marker that MarcEdit uses internally to hold an internal variable.

Example:

=040  \\$aMiU$cMiU
=040  \\$aBDS$beng$cBDS$dOCLCQ$dABCU
=041  \\$aengrusger
=043  \\$ae-gx---$ae-uk---$an-us---
=090  \\$aTK1005$b(INTERNET) $c[UK.]

Say we have these fields – and the pattern I want to create is a 999 field, and in that field, I want to create a new 999 field for each 040$a – but I would also like to have the 090$a be a part of the pattern.
The new pattern would look like this:

=999  \\$a{040$a[x]} : {090$a}

This pattern would generate the following results:

=999  \\$aMiU : TK1005
=999  \\$aBDS : TK1005

If I changed the pattern to:

=999  \\$a{040$a} : {090$a}

The program falls back to use the current functionality (only one field is created).

Please note, you cannot ask for a specific 040 to be used (outside of using find/reg functions inside the pattern) – the data inside the [x] isn’t an integer you can set. It is a value that indicates to MarcEdit that the subfield should be tracked and multiple fields are desired.

The [x] syntax works either after the subfield or after the field number, with data being scoped based on the location of the [x]. Any value other than [x] will likely give inconsistent results. The [x] bracket is a reserved element within the field to indicate that multiple field generation is desired, and to tell the program to tokenize the data marked.

Finally – the tool places data in the index range of the new field being generated. So, consider this example:

=040  \\$aMiU$cMiU
=040  \\$aBDS$beng$cBDS$dOCLCQ$dABCU
=041  \\$aengrusger
=043  \\$ae-gx---$ae-uk---$an-us---
=090  \\$aTK1005$b(INTERNET) $c[UK.]

If I used the following pattern:

=999  \\$a{040$a[x]} : {090$a[x]}

The expected results would be:

=999  \\$aMiU : TK1005
=999  \\$aBDS :

Why? Because the tool will slot values marked with the multi-field value [x] into the same field groups. Since only one 090$a exists, the tool only updates the field group to which it belongs. However, if I had the following data:

=040  \\$aMiU$cMiU
=040  \\$aBDS$beng$cBDS$dOCLCQ$dABCU
=041  \\$aengrusger
=043  \\$ae-gx---$ae-uk---$an-us---
=090  \\$aTK1005$b(INTERNET) $c[UK.]
=090  \\$aG24211$b(INTERNET)

And used this pattern:

=999  \\$a{040$a[x]} : {090$a[x]}

I would expect the following result:

=999  \\$aMiU : TK1005
=999  \\$aBDS : G24211

Again – internally, MarcEdit is creating tokens of data with the [x] and placing them within the same scope. So, the tool would create new fields, placing data within the same scope onto the new fields.

I started making these changes with the last update – and have finished updating the tokenization algorithms so that the tracking of the data is correct. I’ll be turning this new option on with the next update – and across both the Windows and Mac versions.

Since the presence of the [x] is necessary to turn on the multi-field generation, any existing patterns within tasks shouldn’t be impacted by the changes. They will work as they had previously. Only patterns with the new [x] structure will activate the new processing logic.
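The slotting behavior described in this post can be modeled with a short sketch. This is not MarcEdit's tokenization code: the class BuildFieldSketch, its Expand method, and the tuple-based record model are hypothetical and exist only to illustrate the rules. It handles [x] after a subfield only, and it ignores find/reg functions and [x] after the field number.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical model of the [x] rules, for illustration only.
public static class BuildFieldSketch
{
    // Matches tokens such as {040$a} or {040$a[x]}.
    private static readonly Regex Token =
        new Regex(@"\{(?<tag>\d{3})\$(?<code>[a-z0-9])(?<multi>\[x\])?\}");

    // record: each entry is a field tag plus its (subfield code, value) pairs.
    public static List<string> Expand(
        string pattern,
        IEnumerable<(string Tag, List<(char Code, string Value)> Subfields)> record)
    {
        var fields = record.ToList();

        // All values for one tag/subfield pair, in record order.
        List<string> Values(string tag, char code) =>
            fields.Where(f => f.Tag == tag)
                  .SelectMany(f => f.Subfields)
                  .Where(s => s.Code == code)
                  .Select(s => s.Value)
                  .ToList();

        // Output count: the largest value count among [x] tokens, or 1 when
        // the pattern has no [x] token (the single-field fallback).
        int count = Token.Matches(pattern).Cast<Match>()
            .Where(m => m.Groups["multi"].Success)
            .Select(m => Values(m.Groups["tag"].Value, m.Groups["code"].Value[0]).Count)
            .DefaultIfEmpty(1)
            .Max();

        var output = new List<string>();
        for (int i = 0; i < count; i++)
        {
            int slot = i;
            output.Add(Token.Replace(pattern, m =>
            {
                var vals = Values(m.Groups["tag"].Value, m.Groups["code"].Value[0]);
                // [x] tokens take the value for this slot; plain tokens
                // always take the first value found.
                int index = m.Groups["multi"].Success ? slot : 0;
                return index < vals.Count ? vals[index] : "";
            }));
        }
        return output;
    }
}
```

Under these assumptions, a record with 040$a values MiU and BDS and a single 090$a of TK1005 expands the pattern =999  \\$a{040$a[x]} : {090$a} into two fields that both carry TK1005, while marking the 090$a with [x] as well leaves the second field's 090 slot empty, which mirrors the examples in the post.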