on

Members – Block Permissions

Announcement of Members - Block Permissions, a WordPress plugin for showing/hiding content using the block editor (Gutenberg).




on

Exhale Version 2.2.0

Release announcement of version 2.2.0 of the Exhale WordPress theme.




on

I’ve shot at this location a few times but for some reason...



I’ve shot at this location a few times but for some reason I’ve never seen it from the other side. Literal proof that shooting with other creatives gives you new perspective. ???? (at Toronto, Ontario)




on

Thanks for all the positive support and reception to my...



Thanks for all the positive support and reception to my Lightroom presets so far, especially to those who pulled the trigger and became my first customers! I’d love to hear your feedback once you try them out!
.
Still time to enter the giveaway or to take advantage of the 50% sale! See my last post for full details and the link in my profile. ❤️ (at Toronto, Ontario)




on

Bricks are better black. ◾️ (at Toronto, Ontario)



Bricks are better black. ◾️ (at Toronto, Ontario)




on

Lights, camera, action. ???? — A few more days left to get 50% off...



Lights, camera, action. ????

A few more days left to get 50% off my custom Lightroom presets! Link in profile. (at Toronto, Ontario)




on

Merry Xmas everyone! It’s giveaway time! ???????? . Thank you to...



Merry Xmas everyone! It’s giveaway time! ????????
.
Thank you to all those who participated in my preset giveaway this week! The support makes all the hard work and extra effort worth it!
.
Without further ado, the randomly drawn winners of my custom Lightroom presets are @l9lee @rchellau @bokeh.jay! Congrats and check your DMs soon for details! ????
.
You still have until tomorrow to grab my presets (which this shot was edited with) for 50% off! They’ll be going back to regular price after so don’t miss out! ???? (at Toronto, Ontario)




on

Trying to straighten all the lines on this shot is a sure fire...



Trying to straighten all the lines on this shot is a sure fire way to go blind. ???? (at London, United Kingdom)




on

I’ve gone subway hopping for photos in every city...



I’ve gone subway hopping for photos in every city I’ve been to except the one I live in. ???? (at Toronto, Ontario)




on

I just realized that I can export my entire story all at once...



I just realized that I can export my entire story all at once now, which means uploading my tutorials to my Facebook page will be a million times easier (it was tedious to stitch all the individual clips together before). ????
.
Related: I posted a story this morning deconstructing the edit on yesterday’s shot.
.
Also related: I uploaded the 3 tutorials from my November feature on @thecreatorclass to my Facebook page this morning too. More to come! (at London, United Kingdom)




on

This might as well be a Herschel ad. ???? (at London, United...



This might as well be a Herschel ad. ???? (at London, United Kingdom)




on

This trip solidified my conviction to learning photography. A...



This trip solidified my conviction to learning photography. A lot has happened since this shot was taken.
Can you pinpoint the moment you decided to pursue photography? (at Toronto, Ontario)




on

Four days from now I’ll be boarding a one way flight to...



Four days from now I’ll be boarding a one way flight to San Francisco to take on the next evolution of my role at @shopify. Leaving the city that I’ve called home my entire life and the people who have defined everything I am was one of the most uncomfortable decisions I’ve ever had to make. But this wouldn’t be the first time I’ve chased discomfort in my career.
.
I wrote about my ongoing pursuit for discomfort this morning in hopes of inspiring others to do the things that scare and challenge them this year. You can find the link in my profile.
.
Happy 2017! ????
.
????: @jonasll (at San Francisco, California)




on

Quick survey: on average, what time is it when you check...



Quick survey: on average, what time is it when you check Instagram for the first time on any given day? (Be sure to include your timezone!)
.
PS: Thank you for all the incredible support on yesterday’s announcement. ❤️ (at Toronto, Ontario)







on

Self-promotion

The world has changed. Everything we do is more immediately visible to others than ever before, but much remains the same; the relationships we develop are as important as they always were. This post is a few thoughts on self-promotion, and how to have good relationships as a self-publisher.

Meeting people face to face is ace. They could be colleagues, vendors, or clients; at conferences, coffee shops, or meeting rooms. The hallway and bar tracks at conferences are particularly great. I always come away with a refreshed appreciation for meatspace. However, most of our interactions take place over the Web. On the Web, the lines separating different kinds of relationships are a little blurred. The company trying to get you to buy a product or conference ticket uses the same medium as your friends.

Freelancers and small companies (and co-ops!) can have as much of an impact as big businesses. ‘I publish therefore I am’ could be our new mantra. Hence this post, in a way. Although, I confess I have discussed these thoughts with friends and thought it was about time I kept my promise to publish them.

Publishing primarily means text and images. Text is the most prevalent. However, much more meaning is conveyed non-verbally. ‘It’s not what you say, it’s how you say it.’

Text can contain non-verbal elements like style — either handwritten or typographic characters — and emoticons, but we don’t control style in Twitter, email, or feeds. Or in any of the main situations where people read what we write (unless it’s our own site). Emoticons are often used in text to indicate tone, pitch, inflection, and emotion like irony, humour, or dismay. They plug gaps in the Latin alphabet’s scope that could be filled with punctuation like the sarcasm mark. By using them, we affirm how important non-verbal communication is.

The other critical non-verbal communication around text is karma. Karma is our reputation, our social capital with our audience of peers, commentators, and customers. It has two distinct parts: Personality, and professional reputation. ‘It’s not what was said, it’s who said what.’

So, after that quick brain dump, let me recap:

  • Relationships are everything.
  • We publish primarily in text without the nuance of critical non-verbal communication.
  • Text has non-verbal elements like style and emoticons, but we can only control the latter.
  • Context is also non-verbal communication. Context is karma: Character and professional reputation.

Us Brits are a funny bunch. Traditionally reserved. Hyperbole-shy. At least, in public. We use certain extreme adjectives sparingly for the most part, and usually avoid superlatives if at all possible. We wince a little if we forget and get super-excited. We sometimes prefer ‘spiffing’ accompanied by a wry, ironic smile over an outright ‘awesome’. Both are genuine — one has an extra layer in the inflection cake. However, we take great displeasure in observing blunt marketing messages that try to convince us something is true with massive, lobe-smacking enthusiasm, and some sort of exaggerated adjective-osmosis effect. We poke fun at attempts to be overly cool. We expect a decent level of self-awareness and ring of honesty from people who would sell us stuff. The Web is no exception. In fact, I may go so far as to say that the sensibilities of the Web are fairly closely aligned with British sensibilities. Without, of course, any of our crippling embarrassment. In an age when promoting oneself on the Web is almost required for designers, that’s no bad thing. After all, running smack bang through the middle of the new marketing arts is a large dose of reality; we’re just a bunch of folks telling our story. No manipulation, cool-kid feigned nonchalance, or lobe-smacking enthusiasm required.

Consider what the majority of designers do to promote themselves in this brave new maker-creative culture. People like my friend, Elliot Jay Stocks: making his own magazine, making music, distributing WordPress themes, and writing about his experiences. Yes, it is important for him that he has an audience, and yes, he wants us to buy his stuff, but no, he won’t try to impress or trick us into liking him. It’s our choice. Compare this to traditional advertising that tries to appeal to your demographic with key phrases from your tribe, life-style pitches, and the usual raft of Freudian manipulations. (Sarcasm mark needed here, although I do confess to a soft spot for the more visceral and kitsch Freudian manipulations.)

There is a middle ground between the two though. A dangerous place full of bad surprises: The outfit that seems like a human being. It appears to publish just like you would. They want money in exchange for their amazing stuff they’re super-duper proud of. Then, you find out they’re selling it to you at twice the price it is in the States, or that it crashes every time it closes, or has awful OpenType support. You find out the human being was really a corporate cyborg who sounds like you, but is not of you, and it’s impervious to your appeals to human fairness. Then there are the folks who definitely are human, after all they’re only small, and you know their names. All the non-verbal communication tells you so. Then you peek a little closer —  you see the context — and all they seem to do is talk about themselves, or their business. Their interactions are as carefully crafted as the big companies, and they treat their audience as a captive market. Great spirit forefend they share the bandwidth by celebrating anyone else. They sound like one of us, but act like one of them. Their popularity is inversely proportional to their humanity.

Extreme examples, I know. This is me exploring thoughts though, and harsh light helps define the edges. Feel free to sound off if it offends, but mind your non-verbal communication. :)

That brings me to self-promotion versus self-aggrandisement; there’s a big difference between the two. As independent designers and developer-type people, self-promotion is good, necessary, and often mutually beneficial. It’s about goodwill. It connects us to each other and lubricates the Web. We need it. Self-aggrandisement is coarse, obvious, and often an act of denial; the odour of insecurity or arrogance is nauseating. It is to be avoided.

If you consider the difference between a show-off and a celebrant, perhaps it will be clearer what I’m reaching for:

The very best form of self-promotion is celebration. To celebrate is to share the joy of what you do (and critically also celebrate what others do) and invite folks to participate in the party. To show off is a weakness of character — an act that demands acknowledgement and accolade before the actor can feel the tragic joy of thinking themselves affirmed. To celebrate is to share joy. To show-off is to yearn for it.

It’s as tragic as the disdainful, casual arrogance of criticising the output of others less accomplished than oneself. Don’t be lazy now. Critique, if you please. Be bothered to help, or if you can’t hold back, have a little grace by being discreet and respectful. If you’re arrogant enough to think you have the right to treat anyone in the world badly, you grant them the right to reciprocate. Beware.

Celebrants don’t reserve their bandwidth for themselves. They don’t treat their friends like a tricky audience who may throw pennies at you at the end of the performance. They treat them like friends. It’s a pretty simple way of measuring whether what you publish is good: would I do/say/act the same way with my friends? Human scales are always the best scales.

So, this ends. I feel very out of practise at writing. It’s hard after a hiatus. These are a few thoughts that still feel partially-formed in my mind, but I hope there was a tiny snippet or two in there that fired off a few neurons in your brain. Not too many, though, it’s early yet. :)




on

Web Fonts, Dingbats, Icons, and Unicode

Yesterday, Cameron Koczon shared a link to the dingbat font, Pictos, by the talented, Drew Wilson. Cameron predicted that dingbats will soon be everywhere. Symbol fonts, yes, I thought. Dingbats? No, thanks. Jason Santa Maria replied:

@FictiveCameron I hope not, dingbat fonts sort of spit in the face of accessibility and semantics at the moment. We need better options.

Jason rightly pointed out the accessibility and semantic problems with dingbats. By mapping icons to letters or numbers in the character map, they are represented on the page by that icon. That’s what Pictos does. For example, by typing an ‘a’ on your keyboard, and setting Pictos as the font-face for that letter, the Pictos anchor icon is displayed.

Other folks suggested SVG and JS might be better, and other more novel workarounds to hide content from assistive technology like screen readers. All interesting, but either not workable in my view, or just a bit awkward.

Ralf Herrmann has an elegant CSS example that works well in Safari.

Falling down with CSS text-replacement

A CSS solution in an article from Pictos creator, Drew Wilson, relies on the fact that most of his icons are mapped to a character that forms part of the common name for that symbol. The article uses the delete icon as an example which is mapped to ‘d’. Using :before and :after pseudo-elements, Drew suggests you can kind-of wrangle the markup into something sort-of semantic. However, it starts to fall down fast. For example, a check mark (tick) is mapped to ‘3’. There’s nothing semantic about that. Clever replacement techniques just hide the evidence. It’s a hack. There’s nothing wrong with a hack here and there (as box model veterans well know) but the ends have to justify the means. The end of this story is not good as a VoiceOver test by Scott at Filament Group shows. In fairness to Drew Wilson, though, he goes on to say if in doubt, do it the old way, using his font to create a background image and deploy with a negative text-indent.

I agreed with Jason, and mentioned a half-formed idea:

@jasonsantamaria that’s exactly what I was thinking. Proper unicode mapping if possible, perhaps?

The conversation continued, and thanks to Jason, helped me refine the idea into this post.

Jon Hicks flagged a common problem for some Windows users where certain Unicode characters are displayed as ‘missing character’ glyphs depending on what character it is. I think most of the problems with dingbats or missing Unicode characters can be solved with web fonts and Unicode.

Rising with Unicode and web fonts

I’d love to be able to use custom icons via optimised web fonts. I want to do so accessibly and semantically, and have optimised font files. This is how it could be done:

  1. Map the icons in the font to the existing Unicode code points for those symbols wherever possible.

    Unicode code points already exist for many common symbols. Fonts could be tiny, fast, stand-alone symbol fonts. Existing typefaces could also be extended to contain symbols that match the style of individual widths, variants, slopes, and weights. Imagine a set of Clarendon or Gotham symbols for a moment. Wouldn’t that be a joy to behold?

    There may be a possibility that private code points could be used if a code-point does not exist for a symbol we need. Type designers, iconographers, and foundries might agree a common set of extended symbols. Alternatively, they could be proposed for inclusion in Unicode.

  2. Include the font with font-face.

    This assumes ubiquitous support (as any use of dingbats does) — we’re very nearly there. WOFF is coming to Safari and with a bit more campaigning we may even see WOFF on iPad soon.

  3. In HTML, reference the Unicode code points in UTF-8 using numeric character references.

    Unicode characters have corresponding numerical references. Named entities may not be rendered by XML parsers. Sean Coates reminded me that in many Cocoa apps in OS X the character map is accessible via a simple CMD+ALT+t shortcut. Ralf Herrmann mentioned that unicode characters ‘…have “speaking” descriptions (like Leftwards Arrow) and fall back nicely to system fonts.’

Limitations

  1. Accessibility: Limited Unicode / entity support in assistive devices.

    My friend and colleague, Jon Gibbins’s old tests in JAWS 7 show some of the inconsistencies. It seems some characters are read out, some ignored completely, and some read as a question mark. Not great, but perhaps Jon will post more about this in the future.

    Elizabeth Pyatt at Penn State university did some dingbat tests in screen readers. For real Unicode symbols, there are pronunciation files that increase the character repertoire of screen readers, like this file for phonetic characters. Symbols would benefit from one.

  2. Web fonts: font-face not supported.

    If font-face is not supported on certain devices like mobile phones, falling back to system fonts is problematic. Unicode symbols may not be present in any system fonts. If they are, for many designers, they will almost certainly be stylistically suboptimal. It is possible to detect font-face using the Paul Irish technique. Perhaps there could be a way to swap Unicode for images if font-face is not present.

Now, next, and a caveat

I can’t recommend using dingbats like Pictos, but the icons sure are useful as images. Beautifully crafted icon sets as carefully crafted fonts could be very useful for rapidly creating image icons for different resolution devices like the iPhone 4, and iPad.

Perhaps we could try and formulate a standard set of commonly used icons using the Unicode symbols range as a starting point. I’ve struggled to find a better visual list of the existing symbols than this Unicode symbol chart from Johannes Knabe.

Icons in fonts as Unicode symbols needs further testing in assistive devices and using font-face.

Last, but not least, I feel a bit cheeky making these suggestions. A little knowledge is a dangerous thing. Combine it with a bit of imagination, and it can be lethal. I have a limited knowledge about how fonts are created, and about Unicode. The real work would be done by others with deeper knowledge than I. I’d be fascinated to hear from Unicode, accessibility, or font experts to see if this is possible. I hope so. It feels to me like a much more elegant and sustainable solution for scalable icons than dingbat fonts.

For more on Unicode, read this long, but excellent, article recommended by my colleague, Andrei, the architect of Unicode and internationalization support in PHP 6: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets.




on

Auphonic Leveler 1.8 and Auphonic Multitrack 1.4 Updates

Today we released free updates for the Auphonic Leveler Batch Processor and the Auphonic Multitrack Processor with many algorithm improvements and bug fixes for Mac and Windows.

Changelog

  • Linear Filtering Algorithms to avoid Asymmetric Waveforms:
    New zero-phase Adaptive Filtering Algorithms to avoid asymmetric waveforms.
    In asymmetric waveforms, the positive and negative amplitude values are disproportionate - please see Asymmetric Waveforms: Should You Be Concerned?.
    Asymmetrical waveforms are quite natural and not necessarily a problem. They are particularly common on recordings of speech, vocals and can be caused by low-end filtering. However, they limit the amount of gain that can be safely applied without introducing distortion or clipping due to aggressive limiting.
  • Noise Reduction Improvements:
    New and improved noise profile estimation algorithms and bug fixes for parallel Noise Reduction Algorithms.
  • Processing Finished Notification on Mac:
    A system notification (including a short glass sound) is now displayed on Mac OS when the Auphonic Leveler or Auphonic Multitrack has finished processing - thanks to Timo Hetzel.
  • Improved Dithering:
    Improved dithering algorithms - using SoX - if a bit-depth reduction is necessary during file export.
  • Auphonic Multitrack Fixes:
    Fixes for ducking and background tracks and for very short music tracks.
  • New Desktop Apps Documentation:
    The documentation of our desktop apps is now integrated in our new help system:
    see Auphonic Leveler Batch Processor and Auphonic Multitrack Processor.
  • Bug Fixes and Audio Algorithm Improvements:
    This release also includes many small bug fixes and all audio algorithms come with improvements and updated classifiers using the data from our Web Service.

About the Auphonic Desktop Apps

We offer two desktop programs which include our audio algorithms only. The algorithms will be computed offline on your device and are exactly the same as implemented in our Web Service.

The Auphonic Leveler Batch Processor is a batch audio file processor and includes all our (Singletrack) Audio Post Production Algorithms. It can process multiple productions at once.

Auphonic Multitrack includes our Multitrack Post Production Algorithms and requires multiple parallel input audio tracks, which will be analyzed and processed individually as well as combined to create one final mixdown.

Upgrade now

Everyone is encouraged to download the latest binaries:

Please let us know if you have any questions or feedback!






on

Facebook Live Streaming and Audio/Video Hosting connected to Auphonic

Facebook is not only a social media giant, the company also provides valuable tools for broadcasting. Today we release a connection to Facebook, which allows to use the Facebook tools for video/audio production and publishing within Auphonic and our connected services.

The following workflows are possible with Facebook and Auphonic:
  • Use Facebook for live streaming, then import, process and distribute the audio/video with Auphonic.
  • Post your Auphonic audio or video productions directly to the news feed of your Facebook Page or User.
  • Use Facebook as a general media hosting service and share the link or embed the audio/video on any webpage (also visible to non-Facebook users).

Connect to Facebook

First you have to connect to a Facebook account at our External Services Page, click on the "Facebook" button.

Select if you want to connect to your personal Facebook User or to a Facebook Page:

It is always possible to remove or edit the connection in your Facebook Settings (Tab Business Integrations).

Import (Live) Videos from Facebook to Auphonic

Facebook Live is an easy (and free) way to stream live videos:

We implemented an interface to use Facebook as an Incoming External Service. Please select a (live or non-live) video from your Facebook Page/User as the source of a production and then process it with Auphonic:

This workflow allows you to use Facebook for live streaming, import and process the audio/video with Auphonic, then publish a podcast and video version of your live video to any of our connected services.

Export from Auphonic to Facebook

Similar to Youtube, it is possible to use Facebook for media file hosting.
Please add your Facebook Page/User as an External Service in your Productions or Presets to upload the Auphonic results directly to Facebook:

Options for the Facebook export:
  • Distribution Settings
    • Post to News Feed: The exported video is posted directly to your news feed / timeline.
    • Exclude from News Feed: The exported video is visible in the videos tab of your Facebook Page/User (see for example Auphonic's video tab), but it is not posted to your news feed (you can do that later if you want).
    • Secret: Only you can see the exported video, it is not shown in the Facebook video tab and it is not posted to your news feed (you can do that later if you want).
  • Embeddable
    Choose if the exported video should be embeddable in third-party websites.

It is always possible to change the distribution/privacy and embeddable options later directly on Facebook. For example, you can export a video to Facebook as Secret and publish it to your news feed whenever you want.


If your production is audio-only, we automatically generate a video track from the Cover Image and (possible) Chapter Images.
Alternatively you can select an Audiogram Output File, if you want to add an Audiogram (audio waveform visualization) to your Facebook video - for details please see Auphonic Audiogram Generator.

Auphonic Title and Description metadata fields are exported to Facebook as well.
If you add Speech Recognition to your production, we create an SRT file with the speech recognition results and add it to your Facebook video as captions.
See the example below.

Facebook Video Hosting Example with Audiogram and Automatic Captions

Facebook can be used as a general video hosting service: even if you export videos as Secret, you will get a direct link to the video which can be shared or embedded in any third-party websites. Users without a Facebook account are also able to view these videos.

In the example below, we automatically generate an Audiogram Video for an audio-only production, use our integrated Speech Recognition system to create captions and export the video as Secret to Facebook.
Afterwards it can be embedded directly into this blog post (enable Captions if they don't show up per default) - for details please see How to embed a video:

It is also possible to just use the generated result URL from Auphonic to share the link to your video (also visible to non-Facebook users):
https://www.facebook.com/auphonic/videos/1687244844638091/

Important Note:
Facebook needs some time to process an exported video (up to a few minutes) and the direct video link won't work before the processing is finished - please try again a bit later!
On Facebook Pages, you can see the processing progress in your Video Library.

Conclusion

Facebook has many broadcasting tools to offer and is a perfect addition to Auphonic.
Both systems and our other external services can be used to create automated processing and publishing workflows. Furthermore, the export and import to/from Facebook is also fully supported in the Auphonic API.

Please contact us if you have any questions or further ideas!




on

Auphonic Audio Inspector Release

At the Subscribe 9 Conference, we presented the first version of our new Audio Inspector:
The Auphonic Audio Inspector is shown on the status page of a finished production and displays details about what our algorithms are changing in audio files.

A screenshot of the Auphonic Audio Inspector on the status page of a finished Multitrack Production.
Please click on the screenshot to see it in full resolution!

It is possible to zoom and scroll within audio waveforms and the Audio Inspector might be used to manually check production result and input files.

In this blog post, we will discuss the usage and all current visualizations of the Inspector.
If you just want to try the Auphonic Audio Inspector yourself, take a look at this Multitrack Audio Inspector Example.

Inspector Usage

Control bar of the Audio Inspector with scrollbar, play button, current playback position and length, button to show input audio file(s), zoom in/out, toggle legend and a button to switch to fullscreen mode.

Seek in Audio Files
Click or tap inside the waveform to seek in files. The red playhead will show the current audio position.
Zoom In/Out
Use the zoom buttons ([+] and [-]), the mouse wheel or zoom gestures on touch devices to zoom in/out the audio waveform.
Scroll Waveforms
If zoomed in, use the scrollbar or drag the audio waveform directly (with your mouse or on touch devices).
Show Legend
Click the [?] button to show or hide the Legend, which describes details about the visualizations of the audio waveform.
Show Stats
Use the Show Stats link to display Audio Processing Statistics of a production.
Show Input Track(s)
Click Show Input to show or hide input track(s) of a production: now you can see and listen to input and output files for a detailed comparison. Please click directly on the waveform to switch/unmute a track - muted tracks are grayed out slightly:

Showing four input tracks and the Auphonic output of a multitrack production.

Please click on the fullscreen button (bottom right) to switch to fullscreen mode.
Now the audio tracks use all available screen space to see all waveform details:

A multitrack production with output and all input tracks in fullscreen mode.
Please click on the screenshot to see it in full resolution.

In fullscreen mode, it’s also possible to control playback and zooming with keyboard shortcuts:
Press [Space] to start/pause playback, use [+] to zoom in and [-] to zoom out.

Singletrack Algorithms Inspector

First, we discuss the analysis data of our Singletrack Post Production Algorithms.

The audio levels of output and input files, measured according to the ITU-R BS.1770 specification, are displayed directly as the audio waveform. Click on Show Input to see the input and output file. Only one file is played at a time, click directly on the Input or Output track to unmute a file for playback:

Singletrack Production with opened input file.
See the first Leveler Audio Example to try the audio inspector yourself.

Waveform Segments: Music and Speech (gold, blue)
Music/Speech segments are displayed directly in the audio waveform: Music segments are plotted in gold/yellow, speech segments in blue (or light/dark blue).
Waveform Segments: Leveler High/No Amplification (dark, light blue)
Speech segments can be displayed in normal, dark or light blue: Dark blue means that the input signal was very quiet and contains speech, therefore the Adaptive Leveler has to use a high amplification value in this segment.
In light blue regions, the input signal was very quiet as well, but our classifiers decided that the signal should not be amplified (breathing, noise, background sounds, etc.).

Yellow/orange background segments display leveler fades.

Background Segments: Leveler Fade Up/Down (yellow, orange)
If the volume of an input file changes in a fast way, the Adaptive Leveler volume curve will increase/decrease very fast as well (= fade) and should be placed in speech pauses. Otherwise, if fades are too slow or during active speech, one will hear pumping speech artifacts.
Exact fade regions are plotted as yellow (fade up, volume increase) and orange (fade down, volume decrease) background segments in the audio inspector.

Horizontal red lines display noise and hum reduction profiles.

Horizontal Lines: Noise and Hum Reduction Profiles (red)
Our Noise and Hiss Reduction and Hum Reduction algorithms segment the audio file in regions with different background noise characteristics, which are displayed as red horizontal lines in the audio inspector (top lines for noise reduction, bottom lines for hum reduction).
Then a noise print is extracted in each region and a classifier decides if and how much noise reduction is necessary - this is plotted as a value in dB below the top red line.
The hum base frequency (50Hz or 60Hz) and the strength of all its partials is also classified in each region, the value in Hz above the bottom red line indicates the base frequency and whether hum reduction is necessary or not (no red line).

You can try the singletrack audio inspector yourself with our Leveler, Noise Reduction and Hum Reduction audio examples.

Multitrack Algorithms Inspector

If our Multitrack Post Production Algorithms are used, additional analysis data is shown in the audio inspector.

The audio levels of the output and all input tracks are measured according to the ITU-R BS.1770 specification and are displayed directly as the audio waveform. Click on Show Input to see all the input files with track labels and the output file. Only one file is played at a time, click directly into the track to unmute a file for playback:

Input Tracks: Waveform Segments, Background Segments and Horizontal Lines
Input tracks are displayed below the output file including their track names. The same data as in our Singletrack Algorithms Inspector is calculated and plotted separately in each input track:
Output Waveform Segments: Multiple Speakers and Music
Each speaker is plotted in a separate, blue-like color - in the example above we have 3 speakers (normal, light and dark blue) and you can see directly in the waveform when and which speaker is active.
Audio from music input tracks are always plotted in gold/yellow in the output waveform, please try to not mix music and speech parts in music tracks (see also Multitrack Best Practice)!

You can try the multitrack audio inspector yourself with our Multitrack Audio Inspector Example or our general Multitrack Audio Examples.

Ducking, Background and Foreground Segments

Music tracks can be set to Ducking, Foreground, Background or Auto - for more details please see Automatic Ducking, Foreground and Background Tracks.

Ducking Segments (light, dark orange)
In Ducking, the level of a music track is reduced if one of the speakers is active, which is plotted as a dark orange background segment in the output track.
Foreground music parts, where no speaker is active and the music track volume is not reduced, are displayed as light orange background segments in the output track.
Background Music Segments (dark orange background)
Here the whole music track is set to Background and won’t be amplified when speakers are inactive.
Background music parts are plotted as dark organge background segments in the output track.
Foreground Music Segments (light orange background)
Here the whole music track is set to Foreground and its level won’t be reduced when speakers are active.
Foreground music parts are plotted as light organge background segments in the output track.

You can try the ducking/background/foreground audio inspector yourself: Fore/Background/Ducking Audio Examples.

Audio Search, Chapters Marks and Video

Audio Search and Transcriptions
If our Automatic Speech Recognition Integration is used, a time-aligned transcription text will be shown above the waveform. You can use the search field to search and seek directly in the audio file.
See our Speech Recognition Audio Examples to try it yourself.
Chapters Marks
Chapter Mark start times are displayed in the audio waveform as black vertical lines.
The current chapter title is written above the waveform - see “This is Chapter 2” in the screenshot above.

A video production with output waveform, input waveform and transcriptions in fullscreen mode.
Please click on the screenshot to see it in full resolution.

Video Display
If you add a Video Format or Audiogram Output File to your production, the audio inspector will also show a separate video track in addition to the audio output and input tracks. The video playback will be synced to the audio of output and input tracks.

Supported Audio Formats

We use the native HTML5 audio element for playback and the aurora.js javascript audio decoders to support all common audio formats:

WAV, MP3, AAC/M4A and Opus
These formats are supported in all major browsers: Firefox, Chrome, Safari, Edge, iOS Safari and Chrome for Android.
FLAC
FLAC is supported in Firefox, Chrome, Edge and Chrome for Android - see FLAC audio format.
In Safari and iOS Safari, we use aurora.js to directly decode FLAC files in javascript, which works but uses much more CPU compared to native decoding!
ALAC
ALAC is not supported by any browser so far, therefore we use aurora.js to directly decode ALAC files in javascript. This works but uses much more CPU compared to native decoding!
Ogg Vorbis
Only supported by Firefox, Chrome and Chrome for Android - for details please see Ogg Vorbis audio format.

We suggest to use a recent Firefox or Chrome browser for best performance.
Decoding FLAC and ALAC files also works in Safari and iOS with the help of aurora.js, but javascript decoders need a lot of CPU and they sometimes have problems with exact scrolling and seeking.

Please see our blog post Audio File Formats and Bitrates for Podcasts for more details about audio formats.

Mobile Audio Inspector

Multiple responsive layouts were created to optimize the screen space usage on Android and iOS devices, so that the audio inspector is fully usable on mobile devices as well: tap into the waveform to set the playhead location, scroll horizontally to scroll waveforms, scroll vertically to scroll between tracks, use zoom gestures to zoom in/out, etc.

Unfortunately the fullscreen mode is not available on iOS devices (thanks to Apple), but it works on Android and is a really great way to inspect everything using all the available screen space:

Audio inspector in horizontal fullscreen mode on Android.

Conclusion

Try the Auphonic Audio Inspector yourself: take a look at our Audio Example Page or play with the Multitrack Audio Inspector Example.

The Audio Inspector will be shown in all productions which are created in our Web Service.
It might be used to manually check production result/input files and to send us detailed feedback about audio processing results.

Please let us know if you have some feedback or questions - more visualizations will be added in future!







on

Auphonic Add-ons for Adobe Audition and Adobe Premiere

The new Auphonic Audio Post Production Add-ons for Adobe allows you to use the Auphonic Web Service directly within Adobe Audition and Adobe Premiere (Mac and Windows):

Audition Multitrack Editor with the Auphonic Audio Post Production Add-on.
The Auphonic Add-on can be embedded directly inside the Adobe user interface.


It is possible to export tracks/projects from Audition/Premiere and process them with the Auphonic audio post production algorithms (loudness, leveling, noise reduction - see Audio Examples), use our Encoding/Tagging, Chapter Marks, Speech Recognition and trigger Publishing with one click.
Furthermore, you can import the result file of an Auphonic Production into Audition/Premiere.


Download the Auphonic Audio Post Production Add-ons for Adobe:

Auphonic Add-on for Adobe Audition

Audition Waveform Editor with the Auphonic Audio Post Production Add-on.
Metadata, Marker times and titles will be exported to Auphonic as well.

Export from Audition to Auphonic

You can upload the audio of your current active document (a Multitrack Session or a Single Audio File) to our Web Service.
In case of a Multitrack Session, a mixdown will be computed automatically to create a Singletrack Production in our Web Service.
Unfortunately, it is not possible to export the individual tracks in Audition, which could be used to create Multitrack Productions.

Metadata and Markers
All metadata (see tab Metadata in Audition) and markers (see tab Marker in Audition and the Waveform Editor Screenshot) will be exported to Auphonic as well.
Marker times and titles are used to create Chapter Marks (Enhanced Podcasts) in your Auphonic output files.
Auphonic Presets
You can optionally choose an Auphonic Preset to use previously stored settings for your production.
Start Production and Upload & Edit Buttons
Click Upload & Edit to upload your audio and create a new Production for further editing. After the upload, a web browser will be started to edit/adjust the production and start it manually.
Click Start Production to upload your audio, create a new Production and start it directly without further editing. A web browser will be started to see the results of your production.
Audio Compression
Uncompressed Multitrack Sessions or audio files in Audition (WAV, AIFF, RAW, etc.) will be compressed automatically with lossless codecs to speed up the upload time without a loss in audio quality.
FLAC is used as lossless codec on Windows and Mac OS (>= 10.13), older Mac OS systems (< 10.13) do not support FLAC and use ALAC instead.

Import Auphonic Productions in Audition

To import the result of an Auphonic Production into Audition, choose the corresponding production and click Import.
The result file will be downloaded from the Auphonic servers and can be used within Audition. If the production contains multiple Output File Formats, the output file with the highest bitrate (or uncompressed/lossless if available) will be chosen.

Auphonic Add-on for Adobe Premiere

Premiere Video Editor with the Auphonic Audio Post Production Add-on.
The Auphonic Add-on can be embedded directly inside the Adobe Premiere user interface.

Export from Premiere to Auphonic

You can upload the audio of your current Active Sequence in Premiere to our Web Service.

We will automatically create an audio-only mixdown of all enabled audio tracks in your current Active Sequence.
Video/Image tracks are ignored: no video will be rendered or uploaded to Auphonic!
If you want to export a specific audio track, please just mute the other tracks.

Start Production and Upload & Edit Buttons
Click Upload & Edit to upload your audio and create a new Production for further editing. After the upload, a web browser will be started to edit/adjust the production and start it manually.
Click Start Production to upload your audio, create a new Production and start it directly without further editing. A web browser will be started to see the results of your production.
Auphonic Presets
You can optionally choose an Auphonic Preset to use previously stored settings for your production.
Chapter Markers
Chapter Markers in Premiere (not all the other marker types!) will be exported to Auphonic as well and are used to create Chapter Marks (Enhanced Podcasts) in your Auphonic output files.
Audio Compression
The mixdown of your Active Sequence in Premiere will be compressed automatically with lossless codecs to speed up the upload time without a loss in audio quality.
FLAC is used as lossless codec on Windows and Mac OS (>= 10.13), older Mac OS systems (< 10.13) do not support FLAC and use ALAC instead.

Import Auphonic Productions in Premiere

To import the result of an Auphonic Production into Premiere, choose the corresponding production and click Import.
The result file will be downloaded from the Auphonic servers and can be used within Premiere. If the production contains multiple Output File Formats, the output file with the highest bitrate (or uncompressed/lossless if available) will be chosen.

Installation

Install our Add-ons for Audition and Premiere directly on the Adobe Add-ons website:

Auphonic Audio Post Production for Adobe Audition:
https://exchange.adobe.com/addons/products/20433

Auphonic Audio Post Production for Adobe Premiere:
https://exchange.adobe.com/addons/products/20429

The installation requires the Adobe Creative Cloud desktop application and might take a few minutes. Please also also try to restart Audition/Premiere if the installation does not work (on Windows it was once even necessary to restart the computer to trigger the installation).


After the installation, you can start our Add-ons directly in Audition/Premiere:
navigate to Window -> Extensions and click Auphonic Post Production.

Enjoy

Thanks a lot to Durin Gleaves and Charles Van Winkle from Adobe for their great support!

Please let us know if you have any questions or feedback!







on

New Auphonic Privacy Policy and GDPR Compliance

The new General Data Protection Regulation (GDPR) of the European Union will be implemented on May 25th, 2018. We used this opportunity to rework many of our internal data processing structures, removed unnecessary trackers and apply this strict and transparent regulation also to all our customers worldwide.

Image from pixapay.com.

At Auphonic we store as few personal information as possible about your usage and production data.
Here are a few human-readable excerpts from our privacy policy about which information we collect, how we process it, how long and where we store it - for more details please see our full Privacy Policy.

Information that we collect

  • Your email address when you create an account.
  • Your files, content, configuration parameters and other information, including your photos, audio or video files, production settings, metadata and emails.
  • Your tokens or authentication information if you choose to connect to any External services.
  • Your subscription plan, credits purchases and production billing history associated with your account, where applicable.
  • Your interactions with us, whether by email, on our blog or on our social media platforms.

We do not process any special categories of data (also commonly referred to as “sensitive personal data”).

How we use and process your Data

  • To authenticate you when you log on to your account.
  • To run your Productions, such that Auphonic can create new media files from your Content according to your instructions.
  • To improve our audio processing algorithms. For this purpose, you agree that your Content may be viewed and/or listened to by an Auphonic employee or any person contracted by Auphonic to work on our audio processing algorithms.
  • To connect your Auphonic account to an External service according to your instructions.
  • To develop, improve and optimize the contents, screen layouts and features of our Services.
  • To follow up on any question and request for assistance or information.

When using our Service, you fully retain any rights that you have with regards to your Content, including copyright.

How long we store your Information

Your Productions and any associated audio or video files will be permanently deleted from our servers including all its metadata and possible data from external services after 21 days (7 days for video productions).
We will, however, keep billing metadata associated with your Productions in an internal database (how many hours of audio you processed).

Also, we might store selected audio and/or video files (or excerpts thereof) from your Content in an internal storage space for the purpose of improving our audio processing algorithms.

Other information like Presets, connected External services, Account settings etc. will be stored until you delete them or when your account is deleted.

Where we store your Data

All data that we collect from you is stored on secure servers in the European Economic Area (in Germany).

More Information and Contact

For more information please read our full Privacy Policy.

Please do not hesitate to contact us regarding any matter relating to our privacy policy and GDPR compliance!







on

Codec2: a whole Podcast on a Floppy Disk

In a previous blogpost we talked about the Opus codec, which offers very low bitrates. Another codec seeking to achieve even lower bitrates is Codec 2.

Codec 2 is designed for use with speech only, and although the bitrates are impressive the results aren’t as clear as Opus, as you can hear in the following audio examples. However, there is some interesting work being done with Codec 2 in combination with neural network (WaveNets) that is yielding great results.

Layers of a WaveNet neural network.

Background

Codec 2 is an open source codec designed for speech, and aims for compression rates between 700bps and 3200bps (bits per seconds).

The man behind it, David Rowe, is an electronic engineer currently living in South Australia. He started the project in September 2009, with the main aim of improving low-cost radio communication for people living in remote areas of the world. With this in mind, he set out to develop a codec that would significantly reduce file sizes and the bandwidth required when streaming.

Another motivation according to David, was to be free from patented technologies used by closed source codes which he believes “require expensive and awkward licenses and are stifling innovation”. His belief is that this work can be done without requiring the use of patent protected codecs, so all his work is open source.

Potential Applications

Rowe’s perceived applications include VOIP trunking, voice over low bandwidth HF/VHF digital radio, (especially for amateur radio, so as to avoid issues with the use of proprietary codecs), and developing world and remote area communications, including military, police and emergency services.

Why we’re interested here at Auphonic is for its potential for longer podcasts, presentations and audiobooks, allowing for low storage and minimizing the effect of bad network connections.

How it Works

To achieve the lower rates sought, speech has to be reduced into the smallest possible information/data, and this means that the amount of redundant information that is transmitted has to be minimized.

To do this, Codec 2 uses harmonic sinusoidal speech coding. This splits the speech into 10 - 30ms segments, called frames. Each frame is then analysed for the fundamental frequency (or pitch), and the number of harmonics that fit into a 4Khz bandwidth. Further, for each of the harmonics within the 4khz range, the amplitude and phase are recorded.

This information is then coded, and the decoder reconstructs the audio based on this data.

Codec 2 Block diagrams - Encoder (left) And decoder (right)
Figure from Rowtel.

Audio Examples and Comparison with other Codecs

Whilst it all sounds great in theory, how does the reality match up? Let’s have a listen.

Here is a short wav audio file:

intro-orig.wav - 1.3 MB (download):

Applying Codec 2 (without the WaveNet decoder) at the different rates available, 3200bps, 2400bps,1600bps,1200bps and 700bps, we get:

3200bps (download):
2400bps (download):
1600bps (download):
1200bps (download):
700bps (download):

These examples show significantly reduced file sizes.
Putting that information more meaningfully in terms of how much storage you would need for an hour of audio:

  • At 3200bps, 1 hour of audio requires only 1.37MB (this would fit on one old 3½-inch floppy disk!)
  • A rate of 2400bps equates to 1.03MB/h
  • A rate of 1600bps equates to 0.68MB/h (Or approximately 2 hours of audio on one floppy disk!)
  • A rate of 1200bps equates to 0.51MB/h
  • A rate of 700bps equates to 0.3MB/h

So great compression, but the result is clearly not natural sounding.

As a comparison here is the same audio as a 8kb/s MP3:

MP3 at 8 kb/s - 23kb file size (download):

The file size is significantly larger than Codec 2 and the quality is arguably still not useable. You can clearly hear what is sometimes called sizzle - the weird metallic sounds you hear on low quality MP3s.

There is a final codec which is worth comparing, one that that seems to capture the two ideals of usable quality at low bitrates that we want: Opus.
Because of it's convincing low-bitrate performance, Auphonic already offers Opus encoding all the way down to 6 kbps, the lowest bitrate that Opus supports.

Comparing Opus at this 6 kbps rate to the 8kbps MP3 shows a significant improvement - although slightly muffled, it still sounds natural:

Opus at 6kbps (download):

Returning to Codec 2, and purely as s a bit of fun, here are some samples of Codec 2 on music! (Note that Codec 2 is not designed for music, it was only ever conceived for use on speech).

Original file (download):
As a 8kbps MP3 (download):

I personally couldn’t listen to the MP3 at this rate, so let’s listen to what Codec 2 does!

Codec 2 at different bitrates:

3200bps (download):
2400bps (download):
1600bps (download):
1200bps (download):
700bps (download):

As you can hear, it is not suitable for this application at all!

Codec 2 and WaveNet

As we have heard, despite the impressive bitrates achieved, the end result is not very natural sounding.
However, where it starts to get more interesting is the work done by W. Bastiaan Kleijn from Cornell University Library. He has been using with Codec 2 running at 2400bps on the coding side, but replaced the Codec 2 decoder with a WaveNet deep learning generative model (for more informationsee the paper Wavenet based low rate speech coding).

Here are some samples from the authors:

Codec Male Example
Original File
Codec 2
With WaveNet Decoder
Codec Female Example
Original File
Codec 2
With WaveNet Decoder

Comparing to Codec 2 you can hear a significant increase in quality, and if you compare to the original, there is not a significant decrease in quality.

David Rowe himself has stated that he considers the result to be "a game changer for low bit rate speech coding" and “as good an an 8000bps wideband speech codec”.

Conclusion

Whilst the (original) Codec 2 project represents very interesting work, it is limited, and the end result is not suited for podcasting. Also as we heard in the audio examples, it can only be used for voice recordings, and not music.

However, Codec 2 in combination with a WaveNet decoder improves the quality a lot and the low bitrate (2400bps) would be extremely interesting for podcasts and audiobooks distribution as well: one hour of audio would require only 1.03MB of storage!

Auphonic will add support for Codec 2 output files when the WaveNet decoder is in a usable form. For now we have just added support for Codec 2 input files.







on

New Auphonic Transcript Editor and Improved Speech Recognition Services

Back in late 2016, we introduced Speech Recognition at Auphonic. This allows our users to create transcripts of their recordings, and more usefully, this means podcasts become searchable.
Now we integrated two more speech recognition engines: Amazon Transcribe and Speechmatics. Whilst integrating these services, we also took the opportunity to develop a complete new Transcription Editor:

Screenshot of our Transcript Editor with word confidence highlighting and the edit bar.
Try out the Transcript Editor Examples yourself!


The new Auphonic Transcript Editor is included directly in our HTML transcript output file, displays word confidence values to instantly see which sections should be checked manually, supports direct audio playback, HTML/PDF/WebVTT export and allows you to share the editor with someone else for further editing.

The new services, Amazon Transcribe and Speechmatics, offer transcription quality improvements compared to our other integrated speech recognition services.
They also return word confidence values, timestamps and some punctuation, which is exported to our output files.

The Auphonic Transcript Editor

With the integration of the two new services offering improved recognition quality and word timestamps alongside confidence scores, we realized that we could leverage these improvements to give our users easy-to-use transcription editing.
Therefore we developed a new, open source transcript editor, which is embedded directly in our HTML output file and has been designed to make checking and editing transcripts as easy as possible.

Main features of our transcript editor:
  • Edit the transcription directly in the HTML document.
  • Show/hide word confidence, to instantly see which sections should be checked manually (if you use Amazon Transcribe or Speechmatics as speech recognition engine).
  • Listen to audio playback of specific words directly in the HTML editor.
  • Share the transcript editor with others: as the editor is embedded directly in the HTML file (no external dependencies), you can just send the HTML file to some else to manually check the automatically generated transcription.
  • Export the edited transcript to HTML, PDF or WebVTT.
  • Completely useable on all mobile devices and desktop browsers.

Examples: Try Out the Transcript Editor

Here are two examples of the new transcript editor, taken from our speech recognition audio examples page:

1. Singletrack Transcript Editor Example
Singletrack speech recognition example from the first 10 minutes of Common Sense 309 by Dan Carlin. Speechmatics was used as speech recognition engine without any keywords or further manual editing.
2. Multitrack Transcript Editor Example
A multitrack automatic speech recognition transcript example from the first 20 minutes of TV Eye on Marvel - Luke Cage S1E1. Amazon Transcribe was used as speech recognition engine without any further manual editing.
As this is a multitrack production, the transcript includes exact speaker names as well (try to edit them!).

Transcript Editing

By clicking the Edit Transcript button, a dashed box appears around the text. This indicates that the text is now freely editable on this page. Your changes can be saved by using one of the export options (see below).
If you make a mistake whilst editing, you can simply use the undo/redo function of the browser to undo or redo your changes.


When working with multitrack productions, another helpful feature is the ability to change all speaker names at once throughout the whole transcript just by editing one speaker. Simply click on an instance of a speaker title and change it to the appropriate name, this name will then appear throughout the whole transcript.

Word Confidence Highlighting

Word confidence values are shown visually in the transcript editor, highlighted in shades of red (see screenshot above). The shade of red is dependent on the actual word confidence value: The darker the red, the lower the confidence value. This means you can instantly see which sections you should check/re-work manually to increase the accuracy.

Once you have edited the highlighted text, it will be set to white again, so it’s easy to see which sections still require editing.
Use the button Add/Remove Highlighting to disable/enable word confidence highlighting.

NOTE: Word confidence values are only available in Amazon Transcribe or Speechmatics, not if you use our other integrated speech recognition services!

Audio Playback

The button Activate/Stop Play-on-click allows you to hear the audio playback of the section you click on (by clicking directly on the word in the transcript editor).
This is helpful in allowing you to check the accuracy of certain words by being able to listen to them directly whilst editing, without having to go back and try to find that section within your audio file.

If you use an External Service in your production to export the resulting audio file, we will automatically use the exported file in the transcript editor.
Otherwise we will use the output file generated by Auphonic. Please note that this file is password protected for the current Auphonic user and will be deleted in 21 days.

If no audio file is available in the transcript editor, or cannot be played because of the password protection, you will see the button Add Audio File to add a new audio file for playback.

Export Formats, Save/Share Transcript Editor

Click on the button Export... to see all export and saving/sharing options:

Save/Share Editor
The Save Editor button stores the whole transcript editor with all its current changes into a new HTML file. Use this button to save your changes for further editing or if you want to share your transcript with someone else for manual corrections (as the editor is embedded directly in the HTML file without any external dependencies).
Export HTML / Export PDF / Export WebVTT
Use one of these buttons to export the edited transcript to HTML (for WordPress, Word, etc.), to PDF (via the browser print function) or to WebVTT (so that the edited transcript can be used as subtitles or imported in web audio players of the Podlove Publisher or Podigee).
Every export format is rendered directly in the browser, no server needed.

Amazon Transcribe

The first of the two new services, Amazon Transcribe, offers accurate transcriptions in English and Spanish at low costs, including keywords, word confidence, timestamps, and punctuation.

UPDATE 2019:
Amazon Transcribe offers more languages now - please see Amazon Transcribe Features!

Pricing
The free tier offers 60 minutes of free usage a month for 12 months. After that, it is billed monthly at a rate of $0.0004 per second ($1.44/h).
More information is available at Amazon Transcribe Pricing.
Custom Vocabulary (Keywords) Support
Custom Vocabulary (called Keywords in Auphonic) gives you the ability to expand and customize the speech recognition vocabulary, specific to your case (i.e. product names, domain-specific terminology, or names of individuals).
The same feature is also available in the Google Cloud Speech API.
Timestamps, Word Confidence, and Punctuation
Amazon Transcribe returns a timestamp and confidence value for each word so that you can easily locate the audio in the original recording by searching for the text.
It also adds some punctuation, which is combined with our own punctuation and formatting automatically.

The high-quality (especially in combination with keywords) and low costs of Amazon Transcribe make it attractive, despite only currently supporting two languages.
However, the processing time of Amazon Transcribe is much slower compared to all our other integrated services!

Try it yourself:
Connect your Auphonic account with Amazon Transcribe at our External Services Page.

Speechmatics

Speechmatics offers accurate transcriptions in many languages including word confidence values, timestamps, and punctuation.

Many Languages
Speechmatics’ clear advantage is the sheer number of languages it supports (all major European and some Asiatic languages).
It also has a Global English feature, which supports different English accents during transcription.
Timestamps, Word Confidence, and Punctuation
Like Amazon, Speechmatics creates timestamps, word confidence values, and punctuation.
Pricing
Speechmatics is the most expensive speech recognition service at Auphonic.
Pricing starts at £0.06 per minute of audio and can be purchased in blocks of £10 or £100. This equates to a starting rate of about $4.78/h. Reduced rate of £0.05 per minute ($3.98/h) are available if purchasing £1,000 blocks.
They offer significant discounts for users requiring higher volumes. At this further reduced price point it is a similar cost to the Google Speech API (or lower). If you process a lot of content, you should contact them directly at sales@speechmatics.com and say that you wish to use it with Auphonic.
More information is available at Speechmatics Pricing.

Speechmatics offers high-quality transcripts in many languages. But these features do come at a price, it is the most expensive speech recognition services at Auphonic.

Unfortunately, their existing Custom Dictionary (keywords) feature, which would further improve the results, is not available in the Speechmatics API yet.

Try it yourself:
Connect your Auphonic account with Speechmatics at our External Services Page.

What do you think?

Any feedback about the new speech recognition services, especially about the recognition quality in various languages, is highly appreciated.

We would also like to hear any comments you have on the transcript editor particularly - is there anything missing, or anything that could be implemented better?
Please let us know!






on

Audio Manipulations and Dynamic Ad Insertion with the Auphonic API

We are pleased to announce a new Audio Inserts feature in the Auphonic API: audio inserts are separate audio files (like intros/outros), which will be inserted into your production at a defined offset.
This blog post shows how one can use this feature for Dynamic Ad Insertion and discusses other Audio Manipulation Methods of the Auphonic API.

API-only Feature

For the general podcasting hobbyist, or even for someone producing a regular podcast, the features that are accessible via our web interface are more than sufficient.

However, some of our users, like podcasting companies who integrate our services as part of their products, asked us for dynamic ad insertions. We teamed up with them to develop a way of making this work within the Auphonic API.

We are pleased therefore to announce audio inserts - a new feature that has been made part of our API. This feature is not available through the web interface though, it requires the use of our API.

Before we talk about audio inserts, let's talk about what you need to know about dynamic ad insertion!

Dynamic Ad Insertion

There are two ways of dealing with adverts within podcasts. In the first, adverts are recorded or edited into the podcast and are fixed, or baked in. The second method is to use dynamic insertion, whereby the adverts are not part of the podcast recording/file but can be inserted into the podcast afterwards, at any time.

This second approach would allow you to run new ad campaigns across your entire catalog of shows. As a podcaster this allows you to potentially generate new revenue from your old content.

As a hosting company, dynamic ad insertion allows you to choose up to date and relevant adverts across all the podcasts you host. You can make these adverts relevant by subject or location, for instance.

Your users can define the time for the ads and their podcast episode, you are then in control of the adverts you insert.

Audio Inserts in Auphonic

Whichever approach to adverts you are taking, using audio inserts can help you.

Audio inserts are separate audio files which will be inserted into your main single or multitrack production at your defined offset (in seconds).

When a separate audio file is inserted as part of your production, it creates a gap in the podcast audio file, shifting the audio back by the length of the insert. Helpfully, chapters and other time-based information like transcriptions are also shifted back when an insert is used.

The biggest advantage of this is that Auphonic will apply loudness normalization to the audio insert so, from an audio point of view, it matches the rest of the podcast.

Although created with dynamic ad insertion in mind, this feature can be used for any type of audio inserts: adverts, music songs, individual parts of a recording, etc. In the case of baked-in adverts, you could upload your already processed advert audio as an insert, without having to edit it into your podcast recording using a separate audio editing application.

Please note that audio inserts should already be edited and processed before using them in production. (This is usually the case with pre-recorded adverts anyway). The only algorithm that Auphonic applies to an audio insert is loudness normalization in order to match the loudness of the entire production. Auphonic does not add any other processing (i.e. no leveling, noise reduction etc).

Audio Inserts Coding Example

Here is a brief overview of how to use our API for audio inserts. Be warned, this section is coding heavy, so if this isn't your thing, feel free to move along to the next section!

You can add audio insert files with a call to https://auphonic.com/api/production/{uuid}/multi_input_files.json, where uuid is the UUID of your production.
Here is an example with two audio inserts from an https URL. The offset/position in the main audio file must be given in seconds:

curl -X POST -H "Content-Type: application/json" 
    https://auphonic.com/api/production/{uuid}/multi_input_files.json 
    -u username:password 
    -d '[
            {
                "input_file": "https://mydomain.com/my_audio_insert_1.wav",
                "type": "insert",
                "offset": 20.5
            },
            {
                "input_file": "https://mydomain.com/my_audio_insert_2.wav",
                "type": "insert",
                "offset": 120.3
            }
        ]'

More details showing how to use audio inserts in our API can be seen here.

Additional API Audio Manipulations

In addition to audio inserts, using the Auphonic API offers a number of other audio manipulation options, which are not available via the web interface:

Cut start/end of audio files: See Docs
In Single-track productions, this feature allows the user to cut the start and/or the end of the uploaded audio file. Crucially, time-based information such as chapters etc. will be shifted accordingly.
Fade In/Out time of audio files: See Docs
This allows you to set the fade in/out time (in ms) at the start/end of output files. The default fade time is 100ms, but values can be set between 0ms and 5000ms.
This feature is also available in our Auphonic Leveler Desktop App.
Adding intro and outro: See Docs
Automatically add intros and outros to your main audio input file, as it is also available in our web interface.
Add multiple intros or outros: See Docs
Using our API, you can also add multiple intros or outros to a production. These intros or outros are played in series.
Overlapping intros/outros: See Docs
This feature allows intros/outros to overlap either the main audio or the following/previous intros/outros.

Conclusion

If you haven't explored our API already, the new audio inserts feature allows for greater flexibility and also dynamic ad insertion.
If you offer online services to podcasters, the Auphonic API would also then allow you to pass on Auphonic's audio processing algorithms to your customers.

If this is of interest to you or you have any new feature suggestions that you feel could benefit your company, please get in touch. We are always happy to extend the functionality of our products!







on

Resumable File Uploads to Auphonic

Large file uploads in a web browser are problematic, even in 2018. If working with a poor network connection, uploads can fail and have to be retried from the start.

At Auphonic, our users have to upload large audio and video files, or multiple media files when creating a multitrack production. To minimize any potential issues, we integrated various external services which are specialized for large file transfers, like FTP, SFTP, Dropbox, Google Drive, S3, etc.

To further minimize issues, as of today we have also released resumable and chunked direct file uploads in the web browser to auphonic.com.

If you are not interested in the technical details, please just go to the section Resumable Uploads in Auphonic below.

The Problem with Large File Uploads in the Browser

If using either mobile networks (which remain fragile) or unstable WiFi connections, file uploads are often interrupted and will fail. There are also many areas in the world where connections are quite poor, which makes uploading big files frustrating.

After an interrupted file upload, the web browser must restart the whole upload from the start, which is a problem when it happens in the middle of a 4GB video file upload on a slow connection.
Furthermore, the longer an upload takes, the more likely it is to have a network glitch interrupting the upload, which then has to be retried from the start.

The Solution: Chunked, Resumable Uploads

To avoid user frustration, we need to be able to detect network errors and potentially resume an upload without having to restart it from the beginning.

To achieve this, we have to split a file upload in smaller chunks directly within the web browser, so that these chunks can then be sent to the server afterwards.
If an upload fails or the user wants to pause, it is possible to resume it later and only send those chunks that have not already been uploaded.
If there is a network interruption or change, the upload will be retried automatically.

Companies like Dropbox, Google, Amazon AWS etc. all have their own protocols and API's for chunked uploads, but there are also some open source implementations available, which offer resumable uploads:

resumable.js [link]:
"A JavaScript library providing multiple simultaneous, stable and resumable uploads via the HTML5 File API"
This solutions is a JavaScript library only and requires that the protocol is implemented on the server as well.
tus.io [link]:
"Open Protocol for Resumable File Uploads"
Tus.io offers a simple, cheap and reusable stack for clients and servers (in many languages). They have a blog with further information about resumable uploads, see tus blog.
plupload [link]:
A JavaScript library, similar to resumable.js, which requires a separate server implementation.

We chose to use resumable.js and developed our own server implementation.

Resumable Uploads in Auphonic

If you upload files to a singletrack or multitrack production, you will see the upload progress bar and a pause button, which is one way to pause and resume an upload:

It is also possible to close the browser completely or shut down your computer during the upload, then edit the production and upload the file again later. This will just resume the file upload from the position where it was stopped before.
(Previously uploaded chunks are saved for 24h on our servers, after that you have to start the whole upload again.)

In case of a network problem or if you switch to a different connection, we will resume the upload automatically.
This should solve many problems which were reported by some users in the past!

You can of course also use any of our external services for stable incoming and outgoing file transfers!

Do you still have Uploading Issues?

We hope that uploads to Auphonic are much more reliable now, even on poor connections.

If you still experience any problems, please let us know.
We are very happy about any bug reports and will do our best to fix them!







on

Auphonic Adaptive Leveler Customization (Beta Update)

In late August, we launched the private beta program of our advanced audio algorithm parameters. After feedback by our users and many new experiments, we are proud to release a complete rework of the Adaptive Leveler parameters:

In the previous version, we based our Adaptive Leveler parameters on the Loudness Range descriptor (LRA), which is included in the EBU R128 specification.
Although it worked, it turned out that it is very difficult to set a loudness range target for diverse audio content, which does include speech, background sounds, music parts, etc. The results were not predictable and it was hard to find good target values.
Therefore we developed our own algorithm to measure the dynamic range of audio signals, which works similarly for speech, music and other audio content.

The following advanced parameters for our Adaptive Leveler allow you to customize which parts of the audio should be leveled (foreground, all, speech, music, etc.), how much they should be leveled (dynamic range), and how much micro-dynamics compression should be applied.

To try out the new algorithms, please join our private beta program and let us know your feedback!

Leveler Preset

The Leveler Preset defines which parts of the audio should be adjusted by our Adaptive Leveler:

  • Default Leveler:
    Our classic, default leveling algorithm as demonstrated in the Leveler Audio Examples. Use it if you are unsure.
  • Foreground Only Leveler:
    This preset reacts slower and levels foreground parts only. Use it if you have background speech or background music, which should not be amplified.
  • Fast Leveler:
    A preset which reacts much faster. It is built for recordings with fast and extreme loudness differences, for example, to amplify very quiet questions from the audience in a lecture recording, to balance fast-changing soft and loud voices within one audio track, etc.
  • Amplify Everything:
    Amplify as much as possible. Similar to the Fast Leveler, but also amplifies non-speech background sounds like noise.

Leveler Dynamic Range

Our default Leveler tries to normalize all speakers to a similar loudness so that a consumer in a car or subway doesn't feel the need to reach for the volume control.
However, in other environments (living room, cinema, etc.) or in dynamic recordings, you might want more level differences (Dynamic Range, Loudness Range / LRA) between speakers and within music segments.

The parameter Dynamic Range controls how much leveling is applied: Higher values result in more dynamic output audio files (less leveling). If you want to increase the dynamic range by 3dB (or LU), just increase the Dynamic Range parameter by 3dB.
We also like to call this Loudness Comfort Zone: above a maximum and below a minimum possible level (the comfort zone), no leveling is applied. So if your input file already has a small dynamic range (is within the comfort zone), our leveler will be just bypassed.

Example Use Cases:
Higher dynamic range values should be used if you want to keep more loudness differences in dynamic narration or dynamic music recordings (live concert/classical).
It is also possible to utilize this parameter to generate automatic mixdowns with different loudness range (LRA) values for different target environments (very compressed ones like mobile devices or Alexa, very dynamic ones like home cinema, etc.).

Compressor

Controls Micro-Dynamics Compression:
The compressor reduces the volume of short and loud spikes like "p", "t" or laughter ( short-term dynamics) and also shapes the sound of your voice (it will sound more or less "processed").
The Leveler, on the other hand, adjusts mid-term level differences, as done by a sound engineer, using the faders of an audio mixer, so that a listener doesn't have to adjust the playback volume all the time.
For more details please see Loudness Normalization and Compression of Podcasts and Speech Audio.

Possible values are:
  • Auto:
    The compressor setting depends on the selected Leveler Preset. Medium compression is used in Foreground Only and Default Leveler presets, Hard compression in our Fast Leveler and Amplify Everything presets.
  • Soft:
    Uses less compression.
  • Medium:
    Our default setting.
  • Hard:
    More compression, especially tries to compress short and extreme level overshoots. Use this preset if you want your voice to sound very processed, our if you have extreme and fast-changing level differences.
  • Off:
    No short-term dynamics compression is used at all, only mid-term leveling. Switch off the compressor if you just want to adjust the loudness range without any additional micro-dynamics compression.

Separate Music/Speech Parameters

Use the switch Separate MusicSpeech Parameters (top right), to see separate Adaptive Leveler parameters for music and speech segments, to control all leveling details separately for speech and music parts:

For dialog intelligibility improvements in films and TV, it is important that the speech/dialog level and loudness range is not too soft compared to the overall programme level and loudness range. This parameter allows you to use more leveling in speech parts while keeping music and FX elements less processed.
Note: Speech, music and overall loudness and loudness range of your production are also displayed in our Audio Processing Statistics!

Example Use Case:
Music live recordings or dynamic music mixes, where you want to amplify all speakers (speech dynamic range should be small) but keep the dynamic range within and between music segments (music dynamic range should be high).
Dialog intelligibility improvements for films and TV, without effecting music and FX elements.

Other Advanced Audio Algorithm Parameters

We also offer advanced audio parameters for our Noise, Hum Reduction and Global Loudness Normalization algorithms:

For more details, please see the Advanced Audio Algorithms Documentation.

Want to know more?

If you want to know more details about our advanced algorithm parameters (especially the leveler parameters), please listen to the following podcast interview with Chris Curran (Podcast Engineering School):
Auphonic’s New Advanced Features, with Georg Holzmann – PES 108

Advanced Parameters Private Beta and Feedback

At the moment the advanced algorithm parameters are for beta users only. This is to allow us to get user feedback, so we can change the parameters to suit user needs.
Please let us know your case studies, if you need any other algorithm parameters or if you have any questions!

Here are some private beta invitation codes:

jbwCVpLYrl 6zmLqq8o3z RXYIUbC6al QDmIZLuPKa JIrnGRZBgl SWQOWeZOBD ISeBCA9gTy w5FdsyhZVI qWAvANQ5mC twOjdHrit3
KwnL2Le6jB 63SE2V54KK G32AULFyaM 3H0CLYAwLU mp1GFNVZHr swzvEBRCVa rLcNJHUNZT CGGbL0O4q1 5o5dUjruJ9 hAggWBpGvj
ykJ57cFQSe 0OHAD2u1Dx RG4wSYTLbf UcsSYI78Md Xedr3NPCgK mI8gd7eDvO 0Au4gpUDJB mYLkvKYz1C ukrKoW5hoy S34sraR0BU
J2tlV0yNwX QwNdnStYD3 Zho9oZR2e9 jHdjgUq420 51zLbV09p4 c0cth0abCf 3iVBKHVKXU BK4kTbDQzt uTBEkMnSPv tg6cJtsMrZ
BdB8gFyhRg wBsLHg90GG EYwxVUZJGp HLQ72b65uH NNd415ktFS JIm2eTkxMX EV2C5RAUXI a3iwbxWjKj X1AT7DCD7V y0AFIrWo5l
We are happy to send further invitation codes to all interested users - please do not hesitate to contact us!

If you have an invitation code, you can enter it here to activate the advanced audio algorithm parameters:
Auphonic Algorithm Parameters Private Beta Activation







on

More Languages for Amazon Transcribe Speech Recognition

Until recently, Amazon Transcribe supported speech recognition in English and Spanish only.
Now they included French, Italian and Portuguese as well - and a few other languages (including German) are in private beta.

Update March 2019:
Now Amazon Transcribe supports German and Korean as well.

The Auphonic Audio Inspector on the status page of a finished Multitrack Production including speech recognition.
Please click on the screenshot to see it in full resolution!


Amazon Transcribe is integrated as speech recognition engine within Auphonic and offers accurate transcriptions (compared to other services) at low costs, including keywords / custom vocabulary support, word confidence, timestamps, and punctuation.
See the following AWS blog post and video for more information about recent Amazon Transcribe developments: Transcribe speech in three new languages: French, Italian, and Brazilian Portuguese.

Amazon Transcribe is also a perfect fit if you want to use our Transcript Editor because you will be able to see word timestamps and confidence values to instantly check which section/words should be corrected manually to increase the transcription accuracy:


Screenshot of our Transcript Editor with word confidence highlighting and the edit bar.

These features are also available if you use Speechmatics, but unfortunately not in our other integrated speech recognition services.

About Speech Recognition within Auphonic

Auphonic has built a layer on top of a few external speech recognition services to make audio searchable:
Our classifiers generate metadata during the analysis of an audio signal (music segments, silence, multiple speakers, etc.) to divide the audio file into small and meaningful segments, which are processed by the speech recognition engine. The results from all segments are then combined, and meaningful timestamps, simple punctuation and structuring are added to the resulting text.

To learn more about speech recognition within Auphonic, take a look at our Speech Recognition and Transcript Editor help pages or listen to our Speech Recognition Audio Examples.

A comparison table of our integrated services (price, quality, languages, speed, features, etc.) can be found here: Speech Recognition Services Comparison.

Conclusion

We hope that Amazon and others will continue to add new languages, to get accurate and inexpensive automatic speech recognition in many languages.

Don't hesitate to contact us if you have any questions or feedback about speech recognition or our transcript editor!






on

Dynamic Range Processing in Audio Post Production

If listeners find themselves using the volume up and down buttons a lot, level differences within your podcast or audio file are too big.
In this article, we are discussing why audio dynamic range processing (or leveling) is more important than loudness normalization, why it depends on factors like the listening environment and the individual character of the content, and why the loudness range descriptor (LRA) is only reliable for speech programs.

Photo by Alexey Ruban.

Why loudness normalization is not enough

Everybody who has lived in an apartment building knows the problem: you want to enjoy a movie late at night, but you're constantly on the edge - not only because of the thrilling story, but because your index finger is hovering over the volume down button of your remote. The next loud sound effect is going to come sooner rather than later, and you want to avoid waking up your neighbors with some gunshot sounds blasting from your TV.

In our previous post, we talked about the overall loudness of a production. While that's certainly important to keep in mind, the loudness target is only an average value, ignoring how much the loudness varies within a production. The loudness target of your movie might be in the ideal range, yet the level differences between a gunshot and someone whispering can still be enormous - having you turn the volume down for the former and up for the latter.

While the average loudness might be perfect, level differences can lead to an unpleasant listening experience.

Of course, this doesn't apply to movies alone. The image above shows a podcast or radio production. The loud section is music, the very quiet section just breathing, and the remaining sections are different voices.

To be clear, we're not saying that the above example is problematic per se. There are many situations, where a big difference in levels - a high dynamic range - is justified: for instance, in a movie theater, optimized for listening and without any outside noise, or in classical music.
Also, if the dynamic range is too small, listening can be tiring.

But if you watch the same movie in an outdoor screening in the summer on a beach next to the crashing waves or in the middle of a noisy city, it can be tricky to hear the softer parts.
Spoken word usually has a smaller dynamic range, and if you produce your podcast for a target audience of train or car commuters, the dynamic range should be even smaller, adjusting for the listening situation.

Therefore, hitting the loudness target has less impact on the listening experience than level differences (dynamic range) within one file!
What makes a suitable dynamic range does not only depend on the listening environment, but also on the nature of the content itself. If the dynamic range is too small, the audio can be tiring to listen to, whereas more variability in levels can make a program more interesting, but might not work in all environments, such as a noisy car.

Dynamic range experiment in a car

Wolfgang Rein, audio technician at SWR, a public broadcaster in Germany, did an experiment to test how drivers react to programs with different dynamic ranges. They monitored to what level drivers set the car stereo depending on speed (thus noise level) and audio dynamic range.
While the results are preliminary, it seems like drivers set the volume as low as possible so that they can still understand the content, but don't get distracted by loud sounds.

As drivers adjust the volume to the loudest voice in a program, they won't understand quieter speakers in content with a high dynamic range anymore. To some degree and for short periods of time, they can compensate by focusing more on the radio program, but over time that's tiring. Therefore, if the loudness varies too much, drivers tend to switch to another program rather than adjusting the volume.
Similar results have been found in a study conducted by NPR Labs and Towson University.

On the other hand, the perception was different in pure music programs. When drivers set the volume according to louder parts, they weren't able to hear softer segments or the beginning of a song very well. But that did not matter to them as much and didn't make them want to turn up the volume or switch the program.

Listener's reaction in response to frequent loudness changes. (from John Kean, Eli Johnson, Dr. Ellyn Sheffield: Study of Audio Loudness Range for Consumers in Various Listening Modes and Ambient Noise Levels)

Loudness comfort zone

The reaction of drivers to variable loudness hints at something that BBC sound engineer Mike Thornton calls the loudness comfort zone.

Tests (...) have shown that if the short-term loudness stays within the "comfort zone" then the consumer doesn’t feel the need to reach for the remote control to adjust the volume.
In a blog post, he highlights how the series Blue Planet 2 and Planet Earth 2 might not always have been the easiest to listen to. The graph below shows an excerpt with very loud music, followed by commentary just at the bottom of the green comfort zone. Thornton writes: "with the volume set at a level that was comfortable when the music was playing we couldn’t always hear the excellent commentary from Sir David Attenborough and had to resort to turning on the subtitles to be sure we knew what Sir David was saying!"

Planet Earth 2 Loudness Plot Excerpt. Colored green: comfort zone of +3 to -5LU around the loudness target. (from Mike Thornton: BBC Blue Planet 2 Latest Show In Firing Line For Sound Issues - Are They Right?)

As already mentioned above, a good mix considers the maximum and minimum possible loudness in the target listening environment.
In a movie theater the loudness comfort zone is big (loudness can vary a lot), and loud music is part of the fun, while quiet scenes work just as well. The opposite was true in the aforementioned experiment with drivers, where the loudness comfort zone is much smaller and quiet voices are difficult to understand.

Hence, the loudness comfort zone determines how much dynamic range an audio signal can use in a specific listening environment.

How to measure dynamic range: LRA

When producing audio for various environments, it would be great to have a target value for dynamic range, (the difference between the smallest and largest signal values of an audio signal) as well. Then you could just set a dynamic range target, similarly to a loudness target.

Theoretically, the maximum possible dynamic range of a production is defined by the bit-depth of the audio format. A 16-bit recording can have a dynamic range of 96 dB; for 24-bit, it's 144 dB - which is well above the approx. 120 dB the human ear can handle. However, most of those bits are typically being used to get to a reasonable base volume. Picture a glass of water: you want it to be almost full, with some headroom so that it doesn't spill when there's a sudden movement, i.e. a bigger amplitude wave at the top.

Determining the dynamic range of a production is easier said than done, though. It depends on which signals are included in the measurement: for example, if something like background music or breathing should be considered at all.
The currently preferred method for broadcasting is called Loudness Range, LRA. It is measured in Loudness Units (LU), and takes into account everything between the 10th and the 95th percentile of a loudness distribution, after an additional gating method. In other words, the loudest 5% and quietest 10% of the audio signal are being ignored. This way, quiet breathing or an occasional loud sound effect won't affect the measurement.

Loudness distribution and LRA for the film 'The Matrix'. Figure from EBU Tech Doc 3343 (p.13).

However, the main difficulty is which signals should be included in the loudness range measurement and which ones should be gated. This is unfortunately often very subjective and difficult to define with a purely statistical method like LRA.

Where LRA falls short

Therefore, only pure speech programs give reliable LRA values that are comparable!
For instance, a typical LRA for news programs is 3 LU; for talks and discussions 5 LU is common. LRA values for features, radio dramas, movies or music very much depend on the individual character and might be in the range between 5 and 25 LU.

To further illustrate this, here are some typical LRA values, according to a paper by Thomas Lund (table 2):

ProgramLoudness Range
Matrix, full movie25.0
NBC Interstitials, Jan. 2008, all together (3:30)9.4
Friends Episode 166.6
Speak Ref., Male, German, SQUAM Trk 546.2
Speak Ref., Female, French, SQUAM Trk 514.8
Speak Ref., Male, English, Sound Check3.3
Wish You Were Here, Pink Floyd22.1
Gilgamesh, Battle of Titans, Osaka Symph.19.7
Don’t Cry For Me Arg., Sinead O’Conner13.7
Beethoven Son in F, Op17, Kliegel & Tichman12.0
Rock’n Roll Train, AC/DC6.0
I.G.Y., Donald Fagen3.6

LRA values of music are very unpredictable as well.
For instance, Tom Frampton measured the LRA of songs in multiple genres, and the differences within each genre are quite big. The ten pop songs that he analyzed varied in LRA between 3.7 and 12 LU, country songs between 3.6 and 14.9 LU. In the Electronic genre the individual LRAs were between 3.7 and 15.2 LU. Please see the tables at the bottom of his blog post for more details.

We at Auphonic also tried to base our Adaptive Leveler parameters on the LRA descriptor. Although it worked, it turned out that it is very difficult to set a loudness range target for diverse audio content, which does include speech, background sounds, music parts, etc. The results were not predictable and it was hard to find good target values. Therefore we developed our own algorithm to measure the dynamic range of audio signals.

In conclusion, LRA comparisons are only useful for productions with spoken word only and the LRA value is therefore not applicable as a general dynamic range target value. The more complex a production gets, the more difficult it is to make any judgment based on the LRA.
This is, because the definition of LRA is purely statistical. There's no smart measurement using classifiers that distinguish between music, speech, quiet breathing, background noises and other types of audio. One would need a more intelligent algorithm (as we use in our Adaptive Leveler), that knows which audio segments should be included and excluded from the measurement.

From theory to application: tools

Loudness and dynamic range clearly is a complicated topic. Luckily, there are tools that can help. To keep short-term loudness in range, a compressor can help control sudden changes in loudness - such as p-pops or consonants like t or k. To achieve a good mid-term loudness, i.e. a signal that doesn't go outside the comfort zone too much, a leveler is a good option. Or, just use a fader or manually adjust volume curves. And to make sure that separate productions sound consistent, loudness normalization is the way to go. We have covered all of this in-depth before.

Looking at the audio from above again, with an adaptive leveler applied it looks like this:

Leveler example. Output at the top, input with leveler envelope at the bottom.

Now, the voices are evened out and the music is at a comfortable level, while the breathing has not been touched at all.
We recently extended Auphonic's adaptive leveler, so that it is now possible to customize the dynamic range - please see adaptive leveler customization and advanced multitrack audio algorithms.
If you wanted to increase the loudness comfort zone (or dynamic range) of the standard preset by 10 dB (or LU), for example, the envelope would look like this:

Leveler with higher dynamic range, only touching sections with extremely low or extremely high loudness to fit into a specific loudness comfort zone.

When a production is done, our adaptive leveler uses classifiers to also calculate the integrated loudness and loudness range of dialog and music sections separately. This way it is possible to just compare the dialog LRA and loudness of complex productions.

Assessing the LRA and loudness of dialog and music separately.

Conclusion

Getting audio dynamics right is not easy. Yet, it is an important thing to keep in mind, because focusing on loudness normalization alone is not enough. In fact, hitting the loudness target often has less impact on the listening experience than level differences, i.e. audio dynamics.

If the dynamic range is too small, the audio can be tiring to listen to, whereas a bigger dynamic range can make a program more interesting, but might not work in loud environments, such as a noisy train.
Therefore, a good mix adapts the audio dynamic range according to the target listening environment (different loudness comfort zones in cinema, at home, in a car) and according to the nature of the content (radio feature, movie, podcast, music, etc.).

Furthermore, because the definition of the loudness range / LRA is purely statistical, only speech programs give reliable LRA values that are comparable.
More "intelligent" algorithms are in development, which use classifiers to decide which signals should be included and excluded from the dynamic range measurement.

If you understand German, take a look at our presentation about audio dynamic processing in podcasts for further information:







on

Horizontal or/and Vertical Format in Kayak Photography

Like most paddlers I have a tendency to shoot pictures in a horizontal (landscape) format. It is more tricky to shoot in a vertical format from my tippy kayaks, especially, when I have to use a paddle to stabilize my camera.




on

Winter Stand Up Paddling on Horsetooth Reservoir

I love paddling on the Horsetooth Reservoir in cold season. Boat ramps are closed, no power boat traffic, usually quiet and calm. Snow and ice can enhance scenery. A great time to paddle, train, relax or photograph. The Horsetooth stays […]




on

Concurrency & Multithreading in iOS

Concurrency is the notion of multiple things happening at the same time. This is generally achieved either via time-slicing, or truly in parallel if multiple CPU cores are available to the host operating system. We've all experienced a lack of concurrency, most likely in the form of an app freezing up when running a heavy task. UI freezes don't necessarily occur due to the absence of concurrency — they could just be symptoms of buggy software — but software that doesn't take advantage of all the computational power at its disposal is going to create these freezes whenever it needs to do something resource-intensive. If you've profiled an app hanging in this way, you'll probably see a report that looks like this:

Anything related to file I/O, data processing, or networking usually warrants a background task (unless you have a very compelling excuse to halt the entire program). There aren't many reasons that these tasks should block your user from interacting with the rest of your application. Consider how much better the user experience of your app could be if instead, the profiler reported something like this:

Analyzing an image, processing a document or a piece of audio, or writing a sizeable chunk of data to disk are examples of tasks that could benefit greatly from being delegated to background threads. Let's dig into how we can enforce such behavior into our iOS applications.


A Brief History

In the olden days, the maximum amount of work per CPU cycle that a computer could perform was determined by the clock speed. As processor designs became more compact, heat and physical constraints started becoming limiting factors for higher clock speeds. Consequentially, chip manufacturers started adding additional processor cores on each chip in order to increase total performance. By increasing the number of cores, a single chip could execute more CPU instructions per cycle without increasing its speed, size, or thermal output. There's just one problem...

How can we take advantage of these extra cores? Multithreading.

Multithreading is an implementation handled by the host operating system to allow the creation and usage of n amount of threads. Its main purpose is to provide simultaneous execution of two or more parts of a program to utilize all available CPU time. Multithreading is a powerful technique to have in a programmer's toolbelt, but it comes with its own set of responsibilities. A common misconception is that multithreading requires a multi-core processor, but this isn't the case — single-core CPUs are perfectly capable of working on many threads, but we'll take a look in a bit as to why threading is a problem in the first place. Before we dive in, let's look at the nuances of what concurrency and parallelism mean using a simple diagram:

In the first situation presented above, we observe that tasks can run concurrently, but not in parallel. This is similar to having multiple conversations in a chatroom, and interleaving (context-switching) between them, but never truly conversing with two people at the same time. This is what we call concurrency. It is the illusion of multiple things happening at the same time when in reality, they're switching very quickly. Concurrency is about dealing with lots of things at the same time. Contrast this with the parallelism model, in which both tasks run simultaneously. Both execution models exhibit multithreading, which is the involvement of multiple threads working towards one common goal. Multithreading is a generalized technique for introducing a combination of concurrency and parallelism into your program.


The Burden of Threads

A modern multitasking operating system like iOS has hundreds of programs (or processes) running at any given moment. However, most of these programs are either system daemons or background processes that have very low memory footprint, so what is really needed is a way for individual applications to make use of the extra cores available. An application (process) can have many threads (sub-processes) operating on shared memory. Our goal is to be able to control these threads and use them to our advantage.

Historically, introducing concurrency to an app has required the creation of one or more threads. Threads are low-level constructs that need to be managed manually. A quick skim through Apple's Threaded Programming Guide is all it takes to see how much complexity threaded code adds to a codebase. In addition to building an app, the developer has to:

  • Responsibly create new threads, adjusting that number dynamically as system conditions change
  • Manage them carefully, deallocating them from memory once they have finished executing
  • Leverage synchronization mechanisms like mutexes, locks, and semaphores to orchestrate resource access between threads, adding even more overhead to application code
  • Mitigate risks associated with coding an application that assumes most of the costs associated with creating and maintaining any threads it uses, and not the host OS

This is unfortunate, as it adds enormous levels of complexity and risk without any guarantees of improved performance.


Grand Central Dispatch

iOS takes an asynchronous approach to solving the concurrency problem of managing threads. Asynchronous functions are common in most programming environments, and are often used to initiate tasks that might take a long time, like reading a file from the disk, or downloading a file from the web. When invoked, an asynchronous function executes some work behind the scenes to start a background task, but returns immediately, regardless of how long the original task might takes to actually complete.

A core technology that iOS provides for starting tasks asynchronously is Grand Central Dispatch (or GCD for short). GCD abstracts away thread management code and moves it down to the system level, exposing a light API to define tasks and execute them on an appropriate dispatch queue. GCD takes care of all thread management and scheduling, providing a holistic approach to task management and execution, while also providing better efficiency than traditional threads.

Let's take a look at the main components of GCD:

What've we got here? Let's start from the left:

  • DispatchQueue.main: The main thread, or the UI thread, is backed by a single serial queue. All tasks are executed in succession, so it is guaranteed that the order of execution is preserved. It is crucial that you ensure all UI updates are designated to this queue, and that you never run any blocking tasks on it. We want to ensure that the app's run loop (called CFRunLoop) is never blocked in order to maintain the highest framerate. Subsequently, the main queue has the highest priority, and any tasks pushed onto this queue will get executed immediately.
  • DispatchQueue.global: A set of global concurrent queues, each of which manage their own pool of threads. Depending on the priority of your task, you can specify which specific queue to execute your task on, although you should resort to using default most of the time. Because tasks on these queues are executed concurrently, it doesn't guarantee preservation of the order in which tasks were queued.

Notice how we're not dealing with individual threads anymore? We're dealing with queues which manage a pool of threads internally, and you will shortly see why queues are a much more sustainable approach to multhreading.

Serial Queues: The Main Thread

As an exercise, let's look at a snippet of code below, which gets fired when the user presses a button in the app. The expensive compute function can be anything. Let's pretend it is post-processing an image stored on the device.

import UIKit

class ViewController: UIViewController {
    @IBAction func handleTap(_ sender: Any) {
        compute()
    }

    private func compute() -> Void {
        // Pretending to post-process a large image.
        var counter = 0
        for _ in 0..<9999999 {
            counter += 1
        }
    }
}

At first glance, this may look harmless, but if you run this inside of a real app, the UI will freeze completely until the loop is terminated, which will take... a while. We can prove it by profiling this task in Instruments. You can fire up the Time Profiler module of Instruments by going to Xcode > Open Developer Tool > Instruments in Xcode's menu options. Let's look at the Threads module of the profiler and see where the CPU usage is highest.

We can see that the Main Thread is clearly at 100% capacity for almost 5 seconds. That's a non-trivial amount of time to block the UI. Looking at the call tree below the chart, we can see that the Main Thread is at 99.9% capacity for 4.43 seconds! Given that a serial queue works in a FIFO manner, tasks will always complete in the order in which they were inserted. Clearly the compute() method is the culprit here. Can you imagine clicking a button just to have the UI freeze up on you for that long?

Background Threads

How can we make this better? DispatchQueue.global() to the rescue! This is where background threads come in. Referring to the GCD architecture diagram above, we can see that anything that is not the Main Thread is a background thread in iOS. They can run alongside the Main Thread, leaving it fully unoccupied and ready to handle other UI events like scrolling, responding to user events, animating etc. Let's make a small change to our button click handler above:

class ViewController: UIViewController {
    @IBAction func handleTap(_ sender: Any) {
        DispatchQueue.global(qos: .userInitiated).async { [unowned self] in
            self.compute()
        }
    }

    private func compute() -> Void {
        // Pretending to post-process a large image.
        var counter = 0
        for _ in 0..<9999999 {
            counter += 1
        }
    }
}

Unless specified, a snippet of code will usually default to execute on the Main Queue, so in order to force it to execute on a different thread, we'll wrap our compute call inside of an asynchronous closure that gets submitted to the DispatchQueue.global queue. Keep in mind that we aren't really managing threads here. We're submitting tasks (in the form of closures or blocks) to the desired queue with the assumption that it is guaranteed to execute at some point in time. The queue decides which thread to allocate the task to, and it does all the hard work of assessing system requirements and managing the actual threads. This is the magic of Grand Central Dispatch. As the old adage goes, you can't improve what you can't measure. So we measured our truly terrible button click handler, and now that we've improved it, we'll measure it once again to get some concrete data with regards to performance.

Looking at the profiler again, it's quite clear to us that this is a huge improvement. The task takes an identical amount of time, but this time, it's happening in the background without locking up the UI. Even though our app is doing the same amount of work, the perceived performance is much better because the user will be free to do other things while the app is processing.

You may have noticed that we accessed a global queue of .userInitiated priority. This is an attribute we can use to give our tasks a sense of urgency. If we run the same task on a global queue of and pass it a qos attribute of background , iOS will think it's a utility task, and thus allocate fewer resources to execute it. So, while we don't have control over when our tasks get executed, we do have control over their priority.

A Note on Main Thread vs. Main Queue

You might be wondering why the Profiler shows "Main Thread" and why we're referring to it as the "Main Queue". If you refer back to the GCD architecture we described above, the Main Queue is solely responsible for managing the Main Thread. The Dispatch Queues section in the Concurrency Programming Guide says that "the main dispatch queue is a globally available serial queue that executes tasks on the application’s main thread. Because it runs on your application’s main thread, the main queue is often used as a key synchronization point for an application."

The terms "execute on the Main Thread" and "execute on the Main Queue" can be used interchangeably.


Concurrent Queues

So far, our tasks have been executed exclusively in a serial manner. DispatchQueue.main is by default a serial queue, and DispatchQueue.global gives you four concurrent dispatch queues depending on the priority parameter you pass in.

Let's say we want to take five images, and have our app process them all in parallel on background threads. How would we go about doing that? We can spin up a custom concurrent queue with an identifier of our choosing, and allocate those tasks there. All that's required is the .concurrent attribute during the construction of the queue.

class ViewController: UIViewController {
    let queue = DispatchQueue(label: "com.app.concurrentQueue", attributes: .concurrent)
    let images: [UIImage] = [UIImage].init(repeating: UIImage(), count: 5)

    @IBAction func handleTap(_ sender: Any) {
        for img in images {
            queue.async { [unowned self] in
                self.compute(img)
            }
        }
    }

    private func compute(_ img: UIImage) -> Void {
        // Pretending to post-process a large image.
        var counter = 0
        for _ in 0..<9999999 {
            counter += 1
        }
    }
}

Running that through the profiler, we can see that the app is now spinning up 5 discrete threads to parallelize a for-loop.

Parallelization of N Tasks

So far, we've looked at pushing computationally expensive task(s) onto background threads without clogging up the UI thread. But what about executing parallel tasks with some restrictions? How can Spotify download multiple songs in parallel, while limiting the maximum number up to 3? We can go about this in a few ways, but this is a good time to explore another important construct in multithreaded programming: semaphores.

Semaphores are signaling mechanisms. They are commonly used to control access to a shared resource. Imagine a scenario where a thread can lock access to a certain section of the code while it executes it, and unlocks after it's done to let other threads execute the said section of the code. You would see this type of behavior in database writes and reads, for example. What if you want only one thread writing to a database and preventing any reads during that time? This is a common concern in thread-safety called Readers-writer lock. Semaphores can be used to control concurrency in our app by allowing us to lock n number of threads.

let kMaxConcurrent = 3 // Or 1 if you want strictly ordered downloads!
let semaphore = DispatchSemaphore(value: kMaxConcurrent)
let downloadQueue = DispatchQueue(label: "com.app.downloadQueue", attributes: .concurrent)

class ViewController: UIViewController {
    @IBAction func handleTap(_ sender: Any) {
        for i in 0..<15 {
            downloadQueue.async { [unowned self] in
                // Lock shared resource access
                semaphore.wait()

                // Expensive task
                self.download(i + 1)

                // Update the UI on the main thread, always!
                DispatchQueue.main.async {
                    tableView.reloadData()

                    // Release the lock
                    semaphore.signal()
                }
            }
        }
    }

    func download(_ songId: Int) -> Void {
        var counter = 0

        // Simulate semi-random download times.
        for _ in 0..<Int.random(in: 999999...10000000) {
            counter += songId
        }
    }
}

Notice how we've effectively restricted our download system to limit itself to k number of downloads. The moment one download finishes (or thread is done executing), it decrements the semaphore, allowing the managing queue to spawn another thread and start downloading another song. You can apply a similar pattern to database transactions when dealing with concurrent reads and writes.

Semaphores usually aren't necessary for code like the one in our example, but they become more powerful when you need to enforce synchronous behavior whille consuming an asynchronous API. The above could would work just as well with a custom NSOperationQueue with a maxConcurrentOperationCount, but it's a worthwhile tangent regardless.


Finer Control with OperationQueue

GCD is great when you want to dispatch one-off tasks or closures into a queue in a 'set-it-and-forget-it' fashion, and it provides a very lightweight way of doing so. But what if we want to create a repeatable, structured, long-running task that produces associated state or data? And what if we want to model this chain of operations such that they can be cancelled, suspended and tracked, while still working with a closure-friendly API? Imagine an operation like this:

This would be quite cumbersome to achieve with GCD. We want a more modular way of defining a group of tasks while maintaining readability and also exposing a greater amount of control. In this case, we can use Operation objects and queue them onto an OperationQueue, which is a high-level wrapper around DispatchQueue. Let's look at some of the benefits of using these abstractions and what they offer in comparison to the lower-level GCI API:

  • You may want to create dependencies between tasks, and while you could do this via GCD, you're better off defining them concretely as Operation objects, or units of work, and pushing them onto your own queue. This would allow for maximum reusability since you may use the same pattern elsewhere in an application.
  • The Operation and OperationQueue classes have a number of properties that can be observed, using KVO (Key Value Observing). This is another important benefit if you want to monitor the state of an operation or operation queue.
  • Operations can be paused, resumed, and cancelled. Once you dispatch a task using Grand Central Dispatch, you no longer have control or insight into the execution of that task. The Operation API is more flexible in that respect, giving the developer control over the operation's life cycle.
  • OperationQueue allows you to specify the maximum number of queued operations that can run simultaneously, giving you a finer degree of control over the concurrency aspects.

The usage of Operation and OperationQueue could fill an entire blog post, but let's look at a quick example of what modeling dependencies looks like. (GCD can also create dependencies, but you're better off dividing up large tasks into a series of composable sub-tasks.) In order to create a chain of operations that depend on one another, we could do something like this:

class ViewController: UIViewController {
    var queue = OperationQueue()
    var rawImage = UIImage? = nil
    let imageUrl = URL(string: "https://example.com/portrait.jpg")!
    @IBOutlet weak var imageView: UIImageView!

    let downloadOperation = BlockOperation {
        let image = Downloader.downloadImageWithURL(url: imageUrl)
        OperationQueue.main.async {
            self.rawImage = image
        }
    }

    let filterOperation = BlockOperation {
        let filteredImage = ImgProcessor.addGaussianBlur(self.rawImage)
        OperationQueue.main.async {
            self.imageView = filteredImage
        }
    }

    filterOperation.addDependency(downloadOperation)

    [downloadOperation, filterOperation].forEach {
        queue.addOperation($0)
     }
}

So why not opt for a higher level abstraction and avoid using GCD entirely? While GCD is ideal for inline asynchronous processing, Operation provides a more comprehensive, object-oriented model of computation for encapsulating all of the data around structured, repeatable tasks in an application. Developers should use the highest level of abstraction possible for any given problem, and for scheduling consistent, repeated work, that abstraction is Operation. Other times, it makes more sense to sprinkle in some GCD for one-off tasks or closures that we want to fire. We can mix both OperationQueue and GCD to get the best of both worlds.


The Cost of Concurrency

DispatchQueue and friends are meant to make it easier for the application developer to execute code concurrently. However, these technologies do not guarantee improvements to the efficiency or responsiveness in an application. It is up to you to use queues in a manner that is both effective and does not impose an undue burden on other resources. For example, it's totally viable to create 10,000 tasks and submit them to a queue, but doing so would allocate a nontrivial amount of memory and introduce a lot of overhead for the allocation and deallocation of operation blocks. This is the opposite of what you want! It's best to profile your app thoroughly to ensure that concurrency is enhancing your app's performance and not degrading it.

We've talked about how concurrency comes at a cost in terms of complexity and allocation of system resources, but introducing concurrency also brings a host of other risks like:

  • Deadlock: A situation where a thread locks a critical portion of the code and can halt the application's run loop entirely. In the context of GCD, you should be very careful when using the dispatchQueue.sync { } calls as you could easily get yourself in situations where two synchronous operations can get stuck waiting for each other.
  • Priority Inversion: A condition where a lower priority task blocks a high priority task from executing, which effectively inverts their priorities. GCD allows for different levels of priority on its background queues, so this is quite easily a possibility.
  • Producer-Consumer Problem: A race condition where one thread is creating a data resource while another thread is accessing it. This is a synchronization problem, and can be solved using locks, semaphores, serial queues, or a barrier dispatch if you're using concurrent queues in GCD.
  • ...and many other sorts of locking and data-race conditions that are hard to debug! Thread safety is of the utmost concern when dealing with concurrency.

Parting Thoughts + Further Reading

If you've made it this far, I applaud you. Hopefully this article gives you a lay of the land when it comes to multithreading techniques on iOS, and how you can use some of them in your app. We didn't get to cover many of the lower-level constructs like locks, mutexes and how they help us achieve synchronization, nor did we get to dive into concrete examples of how concurrency can hurt your app. We'll save those for another day, but you can dig into some additional reading and videos if you're eager to dive deeper.




on

Committed to the wrong branch? -, @{upstream}, and @{-1} to the rescue

I get into this situation sometimes. Maybe you do too. I merge feature work into a branch used to collect features, and then continue development but on that branch instead of back on the feature branch

git checkout feature
# ... bunch of feature commits ...
git push
git checkout qa-environment
git merge --no-ff --no-edit feature
git push
# deploy qa-environment to the QA remote environment
# ... more feature commits ...
# oh. I'm not committing in the feature branch like I should be

and have to move those commits to the feature branch they belong in and take them out of the throwaway accumulator branch

git checkout feature
git cherry-pick origin/qa-environment..qa-environment
git push
git checkout qa-environment
git reset --hard origin/qa-environment
git merge --no-ff --no-edit feature
git checkout feature
# ready for more feature commits

Maybe you prefer

git branch -D qa-environment
git checkout qa-environment

over

git checkout qa-environment
git reset --hard origin/qa-environment

Either way, that works. But it'd be nicer if we didn't have to type or even remember the branches' names and the remote's name. They are what is keeping this from being a context-independent string of commands you run any time this mistake happens. That's what we're going to solve here.

Shorthands for longevity

I like to use all possible natively supported shorthands. There are two broad motivations for that.

  1. Fingers have a limited number of movements in them. Save as many as possible left late in life.
  2. Current research suggests that multitasking has detrimental effects on memory. Development tends to be very heavy on multitasking. Maybe relieving some of the pressure on quick-access short term memory (like knowing all relevant branch names) add up to leave a healthier memory down the line.

First up for our scenario: the - shorthand, which refers to the previously checked out branch. There are a few places we can't use it, but it helps a lot:

Bash
# USING -

git checkout feature
# hack hack hack
git push
git checkout qa-environment
git merge --no-ff --no-edit -        # ????
git push
# hack hack hack
# whoops
git checkout -        # now on feature ???? 
git cherry-pick origin/qa-environment..qa-environment
git push
git checkout - # now on qa-environment ????
git reset --hard origin/qa-environment
git merge --no-ff --no-edit -        # ????
git checkout -                       # ????
# on feature and ready for more feature commits
Bash
# ORIGINAL

git checkout feature
# hack hack hack
git push
git checkout qa-environment
git merge --no-ff --no-edit feature
git push
# hack hack hack
# whoops
git checkout feature
git cherry-pick origin/qa-environment..qa-environment
git push
git checkout qa-environment
git reset --hard origin/qa-environment
git merge --no-ff --no-edit feature
git checkout feature
# ready for more feature commits

We cannot use - when cherry-picking a range

> git cherry-pick origin/-..-
fatal: bad revision 'origin/-..-'

> git cherry-pick origin/qa-environment..-
fatal: bad revision 'origin/qa-environment..-'

and even if we could we'd still have provide the remote's name (here, origin).

That shorthand doesn't apply in the later reset --hard command, and we cannot use it in the branch -D && checkout approach either. branch -D does not support the - shorthand and once the branch is deleted checkout can't reach it with -:

# assuming that branch-a has an upstream origin/branch-a
> git checkout branch-a
> git checkout branch-b
> git checkout -
> git branch -D -
error: branch '-' not found.
> git branch -D branch-a
> git checkout -
error: pathspec '-' did not match any file(s) known to git

So we have to remember the remote's name (we know it's origin because we are devoting memory space to knowing that this isn't one of those times it's something else), the remote tracking branch's name, the local branch's name, and we're typing those all out. No good! Let's figure out some shorthands.

@{-<n>} is hard to say but easy to fall in love with

We can do a little better by using @{-<n>} (you'll also sometimes see it referred to be the older @{-N}). It is a special construct for referring to the nth previously checked out ref.

> git checkout branch-a
> git checkout branch-b
> git rev-parse --abbrev-rev @{-1} # the name of the previously checked out branch
branch-a
> git checkout branch-c
> git rev-parse --abbrev-rev @{-2} # the name of branch checked out before the previously checked out one
branch-a

Back in our scenario, we're on qa-environment, we switch to feature, and then want to refer to qa-environment. That's @{-1}! So instead of

git cherry-pick origin/qa-environment..qa-environment

We can do

git cherry-pick origin/qa-environment..@{-1}

Here's where we are (🎉 marks wins from -, 💥 marks the win from @{-1})

Bash
# USING - AND @{-1}

git checkout feature
# hack hack hack
git push
git checkout qa-environment
git merge --no-ff --no-edit -                # ????
git push
# hack hack hack
# whoops
git checkout -                               # ????
git cherry-pick origin/qa-environment..@{-1} # ????
git push
git checkout -                               # ????
git reset --hard origin/qa-environment
git merge --no-ff --no-edit -                # ????
git checkout -                               # ????
# ready for more feature commits
Bash
# ORIGINAL

git checkout feature
# hack hack hack
git push
git checkout qa-environment
git merge --no-ff --no-edit feature
git push
# hack hack hack
# whoops
git checkout feature
git cherry-pick origin/qa-environment..qa-environment
git push
git checkout qa-environment
git reset --hard origin/qa-environment
git merge --no-ff --no-edit feature
git checkout feature
# ready for more feature commits

One down, two to go: we're still relying on memory for the remote's name and the remote branch's name and we're still typing both out in full. Can we replace those with generic shorthands?

@{-1} is the ref itself, not the ref's name, we can't do

> git cherry-pick origin/@{-1}..@{-1}
origin/@{-1}
fatal: ambiguous argument 'origin/@{-1}': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

because there is no branch origin/@{-1}. For the same reason, @{-1} does not give us a generalized shorthand for the scenario's later git reset --hard origin/qa-environment command.

But good news!

Do @{u} @{push}

@{upstream} or its shorthand @{u} is the remote branch a that would be pulled from if git pull were run. @{push} is the remote branch that would be pushed to if git push was run.

> git checkout branch-a
Switched to branch 'branch-a'
Your branch is ahead of 'origin/branch-a' by 3 commits.
  (use "git push" to publish your local commits)
> git reset --hard origin/branch-a
HEAD is now at <the SHA origin/branch-a is at>

we can

> git checkout branch-a
Switched to branch 'branch-a'
Your branch is ahead of 'origin/branch-a' by 3 commits.
  (use "git push" to publish your local commits)
> git reset --hard @{u}                                # <-- So Cool!
HEAD is now at <the SHA origin/branch-a is at>

Tacking either onto a branch name will give that branch's @{upstream} or @{push}. For example

git checkout branch-a@{u}

is the branch branch-a pulls from.

In the common workflow where a branch pulls from and pushes to the same branch, @{upstream} and @{push} will be the same, leaving @{u} as preferable for its terseness. @{push} shines in triangular workflows where you pull from one remote and push to another (see the external links below).

Going back to our scenario, it means short, portable commands with a minimum human memory footprint. (🎉 marks wins from -, 💥 marks the win from @{-1}, 😎 marks the wins from @{u}.)

Bash
# USING - AND @{-1} AND @{u}

git checkout feature
# hack hack hack
git push
git checkout qa-environment
git merge --no-ff --no-edit -    # ????
git push
# hack hack hack
# whoops
git checkout -                   # ????
git cherry-pick @{-1}@{u}..@{-1} # ????????
git push
git checkout -                   # ????
git reset --hard @{u}            # ????
git merge --no-ff --no-edit -    # ????
git checkout -                   # ????
# ready for more feature commits
Bash
# ORIGINAL

git checkout feature
# hack hack hack
git push
git checkout qa-environment
git merge --no-ff --no-edit feature
git push
# hack hack hack
# whoops
git checkout feature
git cherry-pick origin/qa-environment..qa-environment
git push
git checkout qa-environment
git reset --hard origin/qa-environment
git merge --no-ff --no-edit feature
git checkout feature
# ready for more feature commits

Make the things you repeat the easiest to do

Because these commands are generalized, we can run some series of them once, maybe

git checkout - && git reset --hard @{u} && git checkout -

or

git checkout - && git cherry-pick @{-1}@{u}.. @{-1} && git checkout - && git reset --hard @{u} && git checkout -

and then those will be in the shell history just waiting to be retrieved and run again the next time, whether with CtrlR incremental search or history substring searching bound to the up arrow or however your interactive shell is configured. Or make it an alias, or even better an abbreviation if your interactive shell supports them. Save the body wear and tear, give memory a break, and level up in Git.

And keep going

The GitHub blog has a good primer on triangular workflows and how they can polish your process of contributing to external projects.

The FreeBSD Wiki has a more in-depth article on triangular workflow process (though it doesn't know about @{push} and @{upstream}).

The construct @{-<n>} and the suffixes @{push} and @{upstream} are all part of the gitrevisions spec. Direct links to each:



    • Code
    • Front-end Engineering
    • Back-end Engineering

    on

    TrailBuddy: Using AI to Create a Predictive Trail Conditions App

    Viget is full of outdoor enthusiasts and, of course, technologists. For this year's Pointless Weekend, we brought these passions together to build TrailBuddy. This app aims to solve that eternal question: Is my favorite trail dry so I can go hike/run/ride?

    While getting muddy might rekindle fond childhood memories for some, exposing your gear to the elements isn’t great – it’s bad for your equipment and can cause long-term, and potentially expensive, damage to the trail.

    There are some trail apps out there but we wanted one that would focus on current conditions. Currently, our favorites trail apps, like mtbproject.com, trailrunproject.com, and hikingproject.com -- all owned by REI, rely on user-reported conditions. While this can be effective, the reports are frequently unreliable, as condition reports can become outdated in just a few days.

    Our goal was to solve this problem by building an app that brought together location, soil type, and weather history data to create on-demand condition predictions for any trail in the US.

    We built an initial version of TrailBuddy by tapping into several readily-available APIs, then running the combined data through a machine learning algorithm. (Oh, and also by bringing together a bunch of smart and motivated people and combining them with pizza and some of the magic that is our Pointless Weekends. We'll share the other Pointless Project, Scurry, with you soon.)

    The quest for data.

    We knew from the start this app would require data from a number of sources. As previously mentioned, we used REI’s APIs (i.e. https://www.hikingproject.com/data) as the source for basic trail information. We used the trails’ latitude and longitude coordinates as well as its elevation to query weather and soil type. We also found data points such as a trail’s total distance to be relevant to our app users and decided to include that on the front-end, too. Since we wanted to go beyond relying solely on user-reported metrics, which is how REI’s current MTB project works, we came up with a list of factors that could affect the trail for that day.

    First on that list was weather.

    We not only considered the impacts of the current forecast, but we also looked at the previous day’s forecast. For example, it’s safe to assume that if it’s currently raining or had been raining over the last several days, it would likely lead to muddy and unfavorable conditions for that trail. We utilized the DarkSky API (https://darksky.net/dev) to get the weather forecasts for that day, as well as the records for previous days. This included expected information, like temperature and precipitation chance. It also included some interesting data points that we realized may be factors, like precipitation intensity, cloud cover, and UV index. 

    But weather alone can’t predict how muddy or dry a trail will be. To determine that for sure, we also wanted to use soil data to help predict how well a trail’s unique soil composition recovers after precipitation. Similar amounts of rain on trails of very different soil types could lead to vastly different trail conditions. A more clay-based soil would hold water much longer, and therefore be much more unfavorable, than loamy soil. Finding a reliable source for soil type and soil drainage proved incredibly difficult. After many hours, we finally found a source through the USDA that we could use. As a side note—the USDA keeps track of lots of data points on soil information that’s actually pretty interesting! We can’t say we’re soil experts but, we felt like we got pretty close.

    We used Whimsical to build our initial wireframes.

    Putting our design hats on.

    From the very first pitch for this app, TrailBuddy’s main differentiator to peer trail resources is its ability to surface real-time information, reliably, and simply. For as complicated as the technology needed to collect and interpret information, the front-end app design needed to be clean and unencumbered.

    We thought about how users would naturally look for information when setting out to find a trail and what factors they’d think about when doing so. We posed questions like:

    • How easy or difficult of a trail are they looking for?
    • How long is this trail?
    • What does the trail look like?
    • How far away is the trail in relation to my location?
    • For what activity am I needing a trail for?
    • Is this a trail I’d want to come back to in the future?

    By putting ourselves in our users’ shoes we quickly identified key features TrailBuddy needed to have to be relevant and useful. First, we needed filtering, so users could filter between difficulty and distance to narrow down their results to fit the activity level. Next, we needed a way to look up trails by activity type—mountain biking, hiking, and running are all types of activities REI’s MTB API tracks already so those made sense as a starting point. And lastly, we needed a way for the app to find trails based on your location; or at the very least the ability to find a trail within a certain distance of your current location.

    We used Figma to design, prototype, and gather feedback on TrailBuddy.

    Using machine learning to predict trail conditions.

    As stated earlier, none of us are actual soil or data scientists. So, in order to achieve the real-time conditions reporting TrailBuddy promised, we’d decided to leverage machine learning to make predictions for us. Digging into the utility of machine learning was a first for all of us on this team. Luckily, there was an excellent tutorial that laid out the basics of building an ML model in Python. Provided a CSV file with inputs in the left columns, and the desired output on the right, the script we generated was able to test out multiple different model strategies, and output the effectiveness of each in predicting results, shown below.

    We assembled all of the historical weather and soil data we could find for a given latitude/longitude coordinate, compiled a 1000 * 100 sized CSV, ran it through the Python evaluator, and found that the CART and SVM models consistently outranked the others in terms of predicting trail status. In other words, we found a working model for which to run our data through and get (hopefully) reliable predictions from. The next step was to figure out which data fields were actually critical in predicting the trail status. The more we could refine our data set, the faster and smarter our predictive model could become.

    We pulled in some Ruby code to take the original (and quite massive) CSV, and output smaller versions to test with. Now again, we’re no data scientists here but, we were able to cull out a good majority of the data and still get a model that performed at 95% accuracy.

    With our trained model in hand, we could serialize that to into a model.pkl file (pkl stands for “pickle”, as in we’ve “pickled” the model), move that file into our Rails app along with it a python script to deserialize it, pass in a dynamic set of data, and generate real-time predictions. At the end of the day, our model has a propensity to predict fantastic trail conditions (about 99% of the time in fact…). Just one of those optimistic machine learning models we guess.

    Where we go from here.

    It was clear that after two days, our team still wanted to do more. As a first refinement, we’d love to work more with our data set and ML model. Something that was quite surprising during the weekend was that we found we could remove all but two days worth of weather data, and all of the soil data we worked so hard to dig up, and still hit 95% accuracy. Which … doesn’t make a ton of sense. Perhaps the data we chose to predict trail conditions just isn’t a great empirical predictor of trail status. While these are questions too big to solve in just a single weekend, we'd love to spend more time digging into this in a future iteration.



    • News & Culture

    on

    A Viget Exploration: How Tech Can Help in a Pandemic

    Viget Explorations have always been the result of our shared curiosities. They’re usually a spontaneous outcome of team downtime and a shared problem we’ve experienced. We use our Explorations to pursue our diverse interests and contribute to the conversations about building a better digital world.

    As the COVID-19 crisis emerged, we were certainly experiencing a shared problem. As a way to keep busy and manage our anxieties, a small team came together to dive into how technology has helped, and, unfortunately, hindered the community response to the current pandemic.

    We started by researching the challenges we saw: information overload, a lack of clarity, individual responsibility, and change. Then we brainstormed possible technical solutions that could further improve how communities respond to a pandemic. Click here to see our Exploration on some possible ways to take the panic out of pandemics.

    While we aren’t currently pursuing the solutions outlined in the Exploration, we’d love to hear what you think about these approaches, as well as any ideas you have for how technology can help address the outlined challenges.

    Please note, this Exploration doesn’t provide medical information. Visit the Center for Disease Control’s website for current information and COVID-19, its symptoms, and treatments.

    At Viget, we’re adjusting to this crisis for the safety of our clients, our staff, and our communities. If you’d like to hear from Viget's co-founder, Brian Williams, you can read his article on our response to the situation.



    • News & Culture

    on

    CLI Equivalents for Common MAMP PRO and Sequel Pro Tasks

    Working on website front ends I sometimes use MAMP PRO to manage local hosts and Sequel Pro to manage databases. Living primarily in my text editor, a terminal, and a browser window, moving to these click-heavy dedicated apps can feel clunky. Happily, the tasks I have most frequently turned to those apps for —starting and stopping servers, creating new hosts, and importing, exporting, deleting, and creating databases— can be done from the command line.

    I still pull up MAMP PRO if I need to change a host's PHP version or work with its other more specialized settings, or Sequel Pro to quickly inspect a database, but for the most part I can stay on the keyboard and in my terminal. Here's how:

    Command Line MAMP PRO

    You can start and stop MAMP PRO's servers from the command line. You can even do this when the MAMP PRO desktop app isn't open.

    Note: MAMP PRO's menu icon will not change color to reflect the running/stopped status when the status is changed via the command line.

    • Start the MAMP PRO servers:
    /Applications/MAMP PRO.app/Contents/MacOS/MAMP PRO cmd startServers
    • Stop the MAMP PRO servers:
    /Applications/MAMP PRO.app/Contents/MacOS/MAMP PRO cmd stopServers
    • Create a host (replace host_name and root_path):
    /Applications/MAMP PRO.app/Contents/MacOS/MAMP PRO cmd createHost host_name root_path

    MAMP PRO-friendly Command Line Sequel Pro

    Note: if you don't use MAMP PRO, just replace the /Applications/MAMP/Library/bin/mysql with mysql.

    In all of the following commands, replace username with your user name (locally this is likely root) and database_name with your database name. The -p (password) flag with no argument will trigger an interactive password prompt. This is more secure than including your password in the command itself (like -pYourPasswordHere). Of course, if you're using the default password root is not particular secure to begin with so you might just do -pYourPasswordHere.

    Setting the -h (host) flag to localhost or 127.0.0.1 tells mysql to look at what's on localhost. With the MAMP PRO servers running, that will be the MAMP PRO databases.

    # with the MAMP PRO servers running, these are equivalent:
    # /Applications/MAMP/Library/bin/mysql -h 127.0.0.1 other_options
    # and
    # /Applications/MAMP/Library/bin/mysql -h localhost other_options
    
    /Applications/MAMP/Library/bin/mysql mysql_options # enter. opens an interactive mysql session
    mysql> some command; # don't forget the semicolon
    mysql> exit;
    • Create a local database
    # with the MAMP PRO servers running
    # replace `username` with your username, which is `root` by default
    /Applications/MAMP/Library/bin/mysql -h localhost -u username -p -e "create database database_name"

    or

    # with the MAMP PRO servers running
    # replace `username` (`root` by default) and `database_name`
    /Applications/MAMP/Library/bin/mysql -h localhost -u username -p # and then enter
    mysql> create database database_name; # don't forget the semicolon
    mysql> exit

        MAMP PRO's databases are stored in /Library/Application Support/appsolute/MAMP PRO/db so to confirm that it worked you can

    ls /Library/Application Support/appsolute/MAMP PRO/db
    # will output the available mysql versions. For example I have
    mysql56_2018-11-05_16-25-13     mysql57
    
    # If it isn't clear which one you're after, open the main MAMP PRO and click
    # on the MySQL "servers and services" item. In my case it shows "Version: 5.7.26"
    
    # Now look in the relevant MySQL directory
    ls /Library/Application Support/appsolute/MAMP PRO/db/mysql57
    # the newly created database should be in the list
    • Delete a local database
    # with the MAMP PRO servers running
    # replace `username` (`root` by default) and `database_name`
    /Applications/MAMP/Library/bin/mysql -h localhost -u username -p -e "drop database database_name"
    • Export a dump of a local database. Note that this uses mysqldump not mysql.
    # to export an uncompressed file
    # replace `username` (`root` by default) and `database_name`
    /Applications/MAMP/Library/bin/mysqldump -h localhost -u username -p database_name > the/output/path.sql
    
    # to export a compressed file
    # replace `username` (`root` by default) and `database_name`
    /Applications/MAMP/Library/bin/mysqldump -h localhost -u username -p database_name | gzip -c > the/output/path.gz

    • Export a local dump from an external database over SSH. Note that this uses mysqldump not mysql.

    # replace `ssh-user`, `ssh_host`, `mysql_user`, `database_name`, and the output path
    
    # to end up with an uncompressed file
    ssh ssh_user@ssh_host "mysqldump -u mysql_user -p database_name | gzip -c" | gunzip > the/output/path.sql
    
    # to end up with a compressed file
    ssh ssh_user@ssh_host "mysqldump -u mysql_user -p database_name | gzip -c" > the/output/path.gz
    • Import a local database dump into a local database
    # with the MAMP PRO servers running
    # replace `username` (`root` by default) and `database_name`
    /Applications/MAMP/Library/bin/mysql -h localhost -u username -p database_name < the/dump/path.sql
    • Import a local database dump into a remote database over SSH. Use care with this one. But if you are doing it with Sequel Pro —maybe you are copying a Craft site's database from a production server to a QA server— you might as well be able to do it on the command line.
    ssh ssh_user@ssh_host "mysql -u username -p remote_database_name" < the/local/dump/path.sql


    For me, using the command line instead of the MAMP PRO and Sequel Pro GUI means less switching between keyboard and mouse, less opening up GUI features that aren't typically visible on my screen, and generally better DX. Give it a try! And while MAMP Pro's CLI is limited to the essentials, command line mysql of course knows no limits. If there's something else you use Sequel Pro for, you may be able to come up with a mysql CLI equivalent you like even better.



    • Code
    • Front-end Engineering
    • Back-end Engineering

    on

    Pursuing A Professional Certification In Scrum

    Professional certifications have become increasingly popular in this age of career switchers and the freelance gig economy. A certification can be a useful way to advance your skill set quickly or make your resume stand out, which can be especially important for those trying to break into a new industry or attract business while self-employed. Whatever your reason may be for pursuing a professional certificate, there is one question only you can answer for yourself: is it worth it?

    Finding first-hand experiences from professionals with similar career goals and passions was the most helpful research I used to answer that question for myself. So, here’s mine; why I decided to get Scrum certified, how I evaluated my options, and if it was really worth it.

    A shift in mindset

    My background originates in brand strategy where it’s typical for work to follow a predictable order, each step informing the next. This made linear techniques like water-fall timelines, completing one phase of work in its entirety before moving onto the next, and documenting granular tasks weeks in advance helpful and easy to implement. When I made the move to more digitally focused work, tasks followed a much looser set of ‘typical’ milestones. While the general outline remained the same (strategy, design, development, launch) there was a lot more overlap with how tasks informed each other, and would keep informing and re-informing as an iterative workflow would encourage.

    Trying to fit a very fluid process into my very stiff linear approach to project planning didn’t work so well. I didn’t have the right strategies to manage risks in a productive way without feeling like the whole project was off track; with the habit of account for granular details all the time, I struggled to lean on others to help define what we should work on and when, and being okay if that changed once, or twice, or three times. Everything I learned about the process of product development came from learning on the job and making a ton of mistakes—and I knew I wanted to get better.

    Photo by Christin Hume on Unsplash

    I was fortunate enough to work with a group of developers who were looking to make a change, too. Being ‘agile’-enthusiasts, this group of developers were desperately looking for ways to infuse our approach to product work with agile-minded principles (the broad definition of ‘agile’ comes from ‘The Agile Manifesto’, which has influenced frameworks for organizing people and information, often applied in product development). This not only applied to how I worked with them, but how they worked with each other, and the way we all onboarded clients to these new expectations. This was a huge eye opener to me. Soon enough, I started applying these agile strategies to my day-to-day— running stand-ups, setting up backlogs, and reorganizing the way I thought about work output. It’s from this experience that I decided it may be worth learning these principles more formally.

    The choice to get certified

    There is a lot of literature out there about agile methodologies and a lot to be learned from casual research. This benefitted me for a while until I started to work on more complicated projects, or projects with more ambitious feature requests. My decision to ultimately pursue a formal agile certification really came down to three things:

    1. An increased use of agile methods across my team. Within my day-to-day I would encounter more team members who were familiar with these tactics and wanted to use them to structure the projects they worked on.
    2. The need for a clear definition of what processes to follow. I needed to grasp a real understanding of how to implement agile processes and stay consistent with using them to be an effective champion of these principles.
    3. Being able to diversify my experience. Finding ways to differentiate my resume from others with similar experience would be an added benefit to getting a certification. If nothing else, it would demonstrate that I’m curious-minded and proactive about my career.

    To achieve these things, I gravitated towards a more foundational education in a specific agile-methodology. This made Scrum the most logical choice given it’s the basis for many of the agile strategies out there and its dominance in the field.

    Evaluating all the options

    For Scrum education and certification, there are really two major players to consider.

    1. Scrum Alliance - Probably the most well known Scrum organization is Scrum Alliance. They are a highly recognizable organization that does a lot to further the broader understanding of Scrum as a practice.
    2. Scrum.org - Led by the original co-founder of Scrum, Ken Schwaber, Scrum.org is well-respected and touted for its authority in the industry.

    Each has their own approach to teaching and awarding certifications as well as differences in price point and course style that are important to be aware of.

    SCRUM ALLIANCE

    Pros

    • Strong name recognition and leaders in the Scrum field
    • Offers both in-person and online courses
    • Hosts in-person events, webinars, and global conferences
    • Provides robust amounts of educational resources for its members
    • Has specialization tracks for folks looking to apply Scrum to their specific discipline
    • Members are required to keep their skills up to date by earning educational credits throughout the year to retain their certification
    • Consistent information across all course administrators ensuring you'll be set up to succeed when taking your certification test.

    Cons

    • High cost creates a significant barrier to entry (we’re talking in the thousands of dollars here)
    • Courses are required to take the certification test
    • Certification expires after two years, requiring additional investment in time and/or money to retain credentials
    • Difficult to find sample course material ahead of committing to a course
    • Courses are several days long which may mean taking time away from a day job to complete them

    SCRUM.ORG

    Pros

    • Strong clout due to its founder, Ken Schwaber, who is the originator of Scrum
    • Offers in-person classes and self-paced options
    • Hosts in-person events and meetups around the world
    • Provides free resources and materials to the public, including practice tests
    • Has specialization tracks for folks looking to apply Scrum to their specific discipline
    • Minimum score on certification test required to pass; certification lasts for life
    • Lower cost for certification when compared to peers

    Cons

    • Much lesser known to the general public, as compared to its counterpart
    • Less sophisticated educational resources (mostly confined to PDFs or online forums) making digesting the material challenging
    • Practice tests are slightly out of date making them less effective as a study tool
    • Self-paced education is not structured and therefore can’t ensure you’re learning everything you need to know for the test
    • Lack of active and engaging community will leave something to be desired

    Before coming to a decision, it was helpful to me to weigh these pros and cons against a set of criteria. Here’s a helpful scorecard I used to compare the two institutions.

    Scrum Alliance Scrum.org
    Affordability ⚪⚪⚪
    Rigor⚪⚪⚪⚪⚪
    Reputation⚪⚪⚪⚪⚪
    Recognition⚪⚪⚪
    Community⚪⚪⚪
    Access⚪⚪⚪⚪⚪
    Flexibility⚪⚪⚪
    Specialization⚪⚪⚪⚪⚪⚪
    Requirements⚪⚪⚪
    Longevity⚪⚪⚪

    For me, the four areas that were most important to me were:

    • Affordability - I’d be self-funding this certificate so the investment of cost would need to be manageable.
    • Self-paced - Not having a lot of time to devote in one sitting, the ability to chip away at coursework was appealing to me.
    • Reputation - Having a certificate backed by a well-respected institution was important to me if I was going to put in the time to achieve this credential.
    • Access - Because I wanted to be a champion for this framework for others in my organization, having access to resources and materials would help me do that more effectively.

    Ultimately, I decided upon a Professional Scrum Master certification from Scrum.org! The price and flexibility of learning course content were most important to me. I found a ton of free materials on Scrum.org that I could study myself and their practice tests gave me a good idea of how well I was progressing before I committed to the cost of actually taking the test. And, the pedigree of certification felt comparable to that of Scrum Alliance, especially considering that the founder of Scrum himself ran the organization.

    Putting a certificate to good use

    I don’t work in a formal Agile company, and not everyone I work with knows the ins and outs of Scrum. I didn’t use my certification to leverage a career change or new job title. So after all that time, money, and energy, was it worth it?

    I think so. I feel like I use my certification every day and employ many of the principles of Scrum in my day-to-day management of projects and people.

    • Self-organizing teams is really important when fostering trust and collaboration among project members. This means leaning on each other’s past experiences and lessons learned to inform our own approach to work. It also means taking a step back as a project manager to recognize the strengths on your team and trust their lead.
    • Approaching things in bite size pieces is also a best practice I use every day. Even when there isn't a mandated sprint rhythm, breaking things down into effort level, goals, and requirements is an excellent way to approach work confidently and avoid getting too overwhelmed.
    • Retrospectives and stand ups are also absolute musts for Scrum practices, and these can be modified to work for companies and project teams of all shapes and sizes. Keeping a practice of collective communication and reflection will keep a team humming and provides a safe space to vent and improve.
    Photo by Gautam Lakum on Unsplash

    Parting advice

    I think furthering your understanding of industry standards and keeping yourself open to new ways of working will always benefit you as a professional. Professional certifications are readily available and may be more relevant than ever.

    If you’re on this path, good luck! And here are some things to consider:

    • Do your research – With so many educational institutions out there, you can definitely find the right one for you, with the level of rigor you’re looking for.
    • Look for company credits or incentives – some companies cover part or all of the cost for continuing education.
    • Get started ASAP – You don’t need a full certification to start implementing small tactics to your workflows. Implementing learnings gradually will help you determine if it’s really something you want to pursue more formally.




    on

    New regional visas for Australia

    The Australian Government has introduced two new regional visas which requires migrants to commit to life in regional Australia for at least three years. This new visa opens to the door to permanent residency for overseas workers from a wider range of occupations than before — including such occupations as real estate agents, call centre […]

    The post New regional visas for Australia appeared first on Visa Australia - Immigration Lawyers & Registered Migration Agents.




    on

    Occupations that may be taken off or put onto the skilled migration occupation lists

    The Department of Employment, Skills, Small and Family Business is considering removing the following occupations from the Skilled Migration Occupation Lists (Skills List) in March 2020: Careers Counsellor Vehicle Trimmer Business Machine Mechanic Animal Attendants and Trainers Gardener (General) Hairdresser Wood Machinist Massage Therapist Community Worker Diving Instructor (Open Water) Gymnastics Coach or Instructor At […]

    The post Occupations that may be taken off or put onto the skilled migration occupation lists appeared first on Visa Australia - Immigration Lawyers & Registered Migration Agents.




    on

    Visa cancelled due to incorrect information given or provided to the Department of Home Affairs

    It is a requirement that a visa applicant must fill in or complete his or her application form in a manner that all questions are answered, and no incorrect answers are given or provided. There is also a requirement that visa applicants must not provide incorrect information during interviews with the Minister for Immigration (‘Minister’), […]

    The post Visa cancelled due to incorrect information given or provided to the Department of Home Affairs appeared first on Visa Australia - Immigration Lawyers & Registered Migration Agents.



    • Visa Cancellation
    • 1703474 (Refugee) [2017] AATA 2985
    • cancel a visa
    • cancelledvi sa
    • Citizenship and Multicultural Affairs
    • Department of Home Affairs
    • migration act 1958
    • minister for immigration
    • NOICC
    • notice of intention to consider cancellation
    • Sanaee (Migration) [2019] AATA 4506
    • section 109
    • time limits

    on

    Coronavirus (COVID-19) and Visas for Australia

    The World Health Organization has announced that Coronavirus (COVID-19) is a pandemic. The migration situation is changing rapidly throughout Australia. As an Australian citizen or permanent resident, can I still enter Australia? There is no restriction on Australian citizens or permanent residents entering Australia at this stage. However, those arriving in Australia will be required […]

    The post Coronavirus (COVID-19) and Visas for Australia appeared first on Visa Australia - Immigration Lawyers & Registered Migration Agents.




    on

    What can I do if I am on a working holiday or seasonal worker visa in the Coronavirus (COVID-19) crisis?

    Seasonal Worker Programme and Pacific Labour Scheme workers can extend their stay for up to 12 months to work for approved employers as long as pastoral care and accommodation needs of workers are met to minimise health risks to visa holders and the community. Approved employers under the Seasonal Worker Programme and Pacific Labour Scheme […]

    The post What can I do if I am on a working holiday or seasonal worker visa in the Coronavirus (COVID-19) crisis? appeared first on Visa Australia - Immigration Lawyers & Registered Migration Agents.




    on

    Employer sponsored temporary work visas (482 and 457) and Coronavirus (COVID-19)

    If you’re a Temporary Skill Shortage visa holder – what should you do if you have been stood down or your work hours are reduced by your employer? The Australian Government has announced that Temporary Skill Shortage visa holders who have been stood down, but not laid off, will maintain their visa validity and businesses […]

    The post Employer sponsored temporary work visas (482 and 457) and Coronavirus (COVID-19) appeared first on Visa Australia - Immigration Lawyers & Registered Migration Agents.




    on

    Student visa holders and New Zealand citizens in Australia and the Coronavirus (COVID-19) crisis?

    International students who have been in Australia for longer than 12 months who find themselves in financial hardship will be able to access their Australian superannuation. The Government will undertake further engagement with the international education sector who already provide some financial support for international students facing hardship. International students working in supermarkets will have […]

    The post Student visa holders and New Zealand citizens in Australia and the Coronavirus (COVID-19) crisis? appeared first on Visa Australia - Immigration Lawyers & Registered Migration Agents.




    on

    Social Icons Widget 4.0 — Now With a Social Icons Block for Gutenberg Included

    In 2015 we launched Social Icons Widget by WPZOOM with the intent to provide WordPress users with a simple and easy-to-use widget for adding social links to their websites. With over 100k installs at the moment and continuous positive feedback from the users, it kept us motivated to constantly improve and keep updating this free plugin. Now, to keep the […]




    on

    Presence 2.0: Beaver Builder Integration, Dark Skin & More!

    Great news for the users of Presence — our multipurpose theme. We have finally released the long-awaited 2.0 version, which features major changes and improvements. What’s new in Presence 2.0? Beaver Builder Integration Dark Skin New Demo: Organic Shop New Typography and Colors options in the Customizer New Templates in Page Builder Beaver Builder Integration If you have followed recent […]




    on

    If You’re Using Beaver Builder Lite, You Need This Addon

    Hey there, I’m Ben, and I’m a guest author here at WPZOOM. Today I thought I’d share with you my experience of one of their rather awesome plugins, an addon for Beaver Builder. I know the team at WPZOOM are big fans of Beaver Builder, why not? It’s a great page builder with an excellent feature set; chances are if […]




    on

    How to Create an Online Ordering Page for Restaurants with WooCommerce

    Until recently it was something normal for any restaurant to have a well-maintained website. Even so, it seems that for many restaurants this was something difficult to achieve. In these difficult times, for many restaurant owners and other businesses in this field, owning just a simple website is no longer enough. If you still want to remain in business you […]